About this book

It has been evident from many years of research work in the geohydrologic sciences that a summary of relevant past work, present work, and needed future work in multivariate statistics with geohydrologic applications is not only desirable, but necessary. This book is intended to serve a broad scientific audience, but more specifically is geared toward scientists doing studies in geohydrology and related geosciences. Its objective is to address both introductory and advanced concepts and applications of the multivariate procedures in use today. Some of the procedures are classical in scope, but others are on the forefront of statistical science and have received limited use in geohydrology or related sciences. The past three decades have seen a significant jump in the application of new research methodologies that focus on analyzing large databases. With more general applications being developed by statisticians in various disciplines, multivariate quantitative procedures are evolving at a rapid rate for better scientific application and now provide for quick and informative analyses of large datasets. The procedures include a family of statistical research methods that are alternatively called "multivariate analysis" or "multivariate statistical methods".

Table of contents

Frontmatter

Introduction to General Statistical and Multivariate Concepts

Frontmatter

Chapter 1. General Concepts

Abstract
Statistics is that branch of mathematics which deals with the analysis of data, and is divided into descriptive statistics and inferential statistics (statistical inference). Multivariate statistics is an extension of univariate (one variable) or bivariate (two variables) statistics. It allows a single test instead of many different univariate and bivariate tests when a large number of variables are being investigated.
Charles E. Brown

Chapter 2. Introduction to Multivariate Statistical Procedures

Abstract
The type of data to be studied is a deciding factor in all statistical methods and is very important in studies using multivariate statistics. Data may be classified as continuous or discrete, normal or non-normal, and based on scales of measurement such as ordinal, nominal, or other.
Charles E. Brown

Variable-Directed Procedures Based on Normal Distribution Assumptions

Frontmatter

Chapter 3. Correlation

Abstract
In many studies, there may be many independent and dependent variables that are measured and must be analyzed. The basis of multivariate procedures derives from an n-dimensional random variable, sometimes called an independent random vector composed of univariate random variables in the form of matrices.
Charles E. Brown

Chapter 4. Factor Analysis

Abstract
Factor analysis is used to form a subset of uncorrelated theoretical variables called factors that adequately explain the variation in the original variable set.
Charles E. Brown

Chapter 5. Canonical Correlation

Abstract
Canonical correlation is the analysis of the correlation between two sets of variables wherein the linear relationship between two sets of variables is maximized.
Charles E. Brown

Chapter 6. Multiple Regression

Abstract
The objective of multiple regression is to derive a linear relationship between a dependent variable and a group of independent variables.
Charles E. Brown
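
As a rough illustration of the linear relationship described in the Chapter 6 abstract, the sketch below fits an intercept and two slopes by ordinary least squares using the normal equations. The function names `solve` and `fit` and the small dataset are hypothetical and chosen only for this example; they are not taken from the book, and a real analysis would normally rely on a statistics package.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so the original inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit(xs, y):
    """Return [intercept, slope_1, ..., slope_k] minimizing the squared residuals."""
    design = [[1.0] + list(row) for row in xs]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data generated from y = 1 + 2*x1 + 3*x2 with no noise.
xs = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in xs]
coefs = fit(xs, y)
```

Because the example data contain no noise, `coefs` recovers the generating coefficients up to floating-point error; with real measurements the fit would only approximate the relationship.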

Chapter 7. Multivariate Analysis of Variance

Abstract
The multivariate analysis of variance (MANOVA) is a generalization of analysis of variance when there are multiple dependent variables, and tests whether mean differences among groups on a combination of dependent variables are a chance occurrence. A new dependent variable that maximizes group differences is created from the set of dependent variables by forming a linear combination of measured or observed dependent variables. Then an analysis of variance is performed on the new dependent variable by a test of variances.
Charles E. Brown

Chapter 8. Multivariate Analysis of Covariance

Abstract
The objective of multivariate analysis of covariance is to determine if there are statistically reliable mean differences that can be demonstrated among groups after adjusting the newly created variable (dependent variable) for differences on one or more covariates. When random assignment of samples or subjects to groups is not possible, multivariate analysis of covariance (MANCOVA) provides statistical matching of groups by adjusting dependent variables as if all subjects scored the same on the covariates.
Charles E. Brown

Variable-Directed Techniques not Based on Normal Distribution Assumptions

Frontmatter

Chapter 9. Principal Components

Abstract
Variables that are correlated to one another form factors or principal components related to an underlying or theoretical process.
Charles E. Brown
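
To make the link between correlation and principal components in the Chapter 9 abstract concrete, the following sketch treats the simplest case of two standardized variables. For a 2 x 2 correlation matrix with off-diagonal value r, the eigenvalues are 1 + r and 1 - r, so the share of variance carried by the first component follows directly from r. The function names and the data are hypothetical and serve only as an illustration; they do not reproduce any example from the book.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def two_variable_pca(x, y):
    """Eigenvalues and explained-variance shares of the 2 x 2 correlation matrix."""
    r = pearson_r(x, y)
    # The matrix [[1, r], [r, 1]] has eigenvalues 1 + r and 1 - r.
    eigenvalues = sorted([1.0 + r, 1.0 - r], reverse=True)
    explained = [ev / 2.0 for ev in eigenvalues]  # total variance is 2
    return r, eigenvalues, explained


# Hypothetical paired measurements.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r, eigenvalues, explained = two_variable_pca(x, y)
```

The stronger the correlation between the two variables, the larger the share of variance captured by the first component, which is the sense in which correlated variables "form" a component.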

Individual-Directed Techniques Based on Normal Distribution Assumptions

Frontmatter

Chapter 10. Multiple Discriminant Analysis

Abstract
The objective of discriminant analysis is to determine group membership of samples from a group of predictors by finding linear combinations of the variables which maximize the differences between the populations being studied, with the objective of establishing a model to sort objects into their appropriate populations with minimal error.
Charles E. Brown

Individual-Directed Techniques not Based on Normal Distribution Assumptions

Frontmatter

Chapter 11. Cluster Analysis

Abstract
The objective of cluster analysis is to separate the groups based on measured characteristics so as to maximize distance between groups.
Charles E. Brown

Chapter 12. Multiple Logistic Regression

Abstract
The multiple logistic regression equation is based on the premise that the natural log of odds (logit) is linearly related to independent variables. The logit equation is the same as for the discriminant function and multiple regression equation with the dependent variable as the natural log of odds.
Charles E. Brown
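
The premise in the Chapter 12 abstract, that the natural log of the odds is a linear function of the independent variables, can be checked numerically. The sketch below is a minimal illustration with hypothetical coefficients and predictor values (none of them come from the book): it converts a linear predictor into a probability and then confirms that the logit of that probability returns the same linear predictor.

```python
import math


def logit(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))


def probability(intercept, coefs, x):
    """Probability implied by a logistic model with a linear predictor."""
    z = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients and one observation with two predictors.
intercept = -1.5
coefs = [0.8, -0.3]
x = [2.0, 1.0]

z = intercept + sum(b * xi for b, xi in zip(coefs, x))  # linear predictor
p = probability(intercept, coefs, x)
recovered = logit(p)  # equals z up to floating-point error
```

Estimating the coefficients from data requires an iterative fitting method such as maximum likelihood, which is outside the scope of this sketch.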

Other Approaches to Explore Multivariate Data

Frontmatter

Chapter 13. Coefficient of Variation

Abstract
If the absolute dispersion is defined as the standard deviation, and the average is the mean, the relative dispersion is called the coefficient of variation (CV) or coefficient of dispersion. The relationship between mean and dispersion is very important in the geosciences and is expressed by the coefficient of variation according to:
$$CV\% = 100\,\sigma / \text{mean}$$
(13.1)
where σ = standard deviation. The coefficient of variation is attractive as a statistical tool because it apparently permits the comparison of variates free from scale effects; i.e., it is dimensionless. However, it has appropriate meaning only if the data achieve ratio scale. The coefficient of variation can be plotted as a graph to compare data. A CV exceeding about 30 percent is often indicative of problems in the data or that the experiment is out of control. Variates with a mean less than unity also provide spurious results, and the coefficient of variation will be very large and often meaningless.
Charles E. Brown
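
Equation 13.1 can be evaluated in a few lines. The sketch below is only an illustration: the function name, the sample values, and the choice of the population standard deviation (rather than the sample standard deviation) are assumptions made here, not details stated in the abstract.

```python
import statistics


def coefficient_of_variation(values):
    """Return CV% = 100 * sigma / mean for ratio-scale data."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    # Assumption: sigma is taken as the population standard deviation.
    sigma = statistics.pstdev(values)
    return 100.0 * sigma / mean


# Hypothetical measurements with mean 5 and standard deviation 2.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
cv = coefficient_of_variation(values)

# Rescaling the data (for example, a change of units) leaves the CV unchanged.
cv_rescaled = coefficient_of_variation([10.0 * v for v in values])
```

The rescaled result matches the original, which illustrates the dimensionless property noted in the abstract. The example values give a CV above the roughly 30 percent level that the abstract flags as a possible sign of problems in the data.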

Chapter 14. Correspondence Analysis

Abstract
Correspondence analysis is defined in some instances as a way of interpreting contingency tables that may be defined through principal components analysis (Mardia et al. 1979). In correspondence analysis used in this book, a factor is represented by the eigenvector of the normalized covariance or correlation matrix (Usunoff and Guzman-Guzman 1989). It can be further viewed as a simultaneous linear regression scheme with dual scaling, which allows the interpretation of both sample sites and variables in the same factor space. The points i (samples) and the points j (variables) can be simultaneously reported on the planes associated with the factor axes. The proximity of a point j to a group of points i is taken as an indication that the variable actually characterizes this group of samples. The contribution, CR, of the points i or j to the variability accounted for by an axis (α) can be computed, and this aids interpretation. The CR values over all points i or j sum to one, according to calculations (Razack and Dazy 1990):
$$\sum\limits_i CR_\alpha(i) = \sum\limits_j CR_\alpha(j) = 1$$
(14.1)
for a given axis (α), and the quality (QT), which expresses the quality of representation of points on the axis (α), is given by:
$$\sum\limits_\alpha QT_\alpha(i) = \sum\limits_\alpha QT_\alpha(j) = 1$$
(14.2)
for a given point i or j.
Charles E. Brown

Chapter 15. Multivariate Probit Analysis

Abstract
Probit analysis is used in the environmental toxicology field as a procedure to study the dosage response relation in a population of biological organisms, where randomly chosen population members are exposed to various levels of applied stimulus and quantal response is assessed as either dead or alive. In some instances more than one organ or physiological system is affected by the stimulus leading to tests of so-called main effects and side effects.
Charles E. Brown

Multivariate Measures of Space, Distance, and Time

Frontmatter

Chapter 16. Multivariate Time Series Modeling

Abstract
Time series models have been applied to many environmental and geohydrological problems. In many instances, such models may be required to provide the most accurate forecasts possible. Before proceeding, a short review of methods will be given.
Charles E. Brown

Chapter 17. Multivariate Spatial Measures

Abstract
Multidimensional scaling is a technique for reducing dimensionality and is based on the distance between points. The result is a reduction in the dimensions used to represent the data with as little distortion as possible. It is similar in its objective to principal components analysis discussed earlier. If the methods employed use rank data, the technique is called non-metric scaling; otherwise, it is called metric scaling. Studies concerning the analysis of distances between places on a map have received great attention in multidimensional scaling applications. The reader is referred to other references for additional insight (Chatfield and Collins 1980; Marriott 1974; Mardia et al. 1979).
Charles E. Brown

Multivariate Data Preparation, Plotting, and Conclusions

Frontmatter

Chapter 18. Multivariate Data Preparation and Plotting

Abstract
Many types of data plots exist for exploring geohydrologic and other scientific data. Several types will be shown and described in the next sections.
Charles E. Brown

Chapter 19. Summary and Generalizations of Multivariate Quantitative Procedures

Abstract
Research problems lend themselves to analysis by a number of multivariate statistical applications. This text has attempted to address some of the applications and underlying basis for choosing a procedure or technique. Scientific parsimony is the essence of scientific reasoning and should always be adhered to. The explained variance in a dataset is a significant aspect of any structural analysis whether it be the explained variance of a row, explained variance of a group of measurements, or the explained variance of a factor or column. In any situation, if we can reduce the number of “explainer” variables or reduce the number of measurements to attain a true value or estimate, we have fulfilled the task of multivariate statistics. The examples chosen for review represent a broad spectrum of methodologies and techniques, by which scientific data can be parsimoniously analysed using model building and model fitting strategies.
Charles E. Brown

Backmatter
