
1996 | Book

The New Statistical Analysis of Data

Authors: T. W. Anderson, Jeremy D. Finn

Publisher: Springer New York


About this book

The Nature of the Book
This book is a text for a first course in statistical concepts and methods. It introduces the analysis of data and statistical inference and explains various methods in enough detail so that the student can apply them. Little mathematical background is required; only high school algebra is used. No mathematical proof is given in the body of the text, although algebraic demonstrations are given in appendices at the ends of some chapters. The exposition is based on logic, verbal explanations, figures, and numerical examples. The verbal and conceptual levels are higher than the mathematical level.
The concepts and methods of statistical analysis are illustrated by more than 100 interesting real-life data sets. Some examples are taken from daily life; many deal with the behavioral sciences, some with business, the health sciences, the physical sciences, and engineering. The exercises are of varying degrees of difficulty.
This book is suitable for undergraduates in many majors and for graduate students in the health sciences, behavioral sciences, and education. It has grown out of our experience over many years of teaching such courses. An earlier text by T. W. Anderson and S. L. Sclove, The Statistical Analysis of Data, had similar objectives and was of a similar nature.

Table of Contents

Frontmatter

Introduction

1. The Nature of Statistics
Abstract
Statistics enters into almost every phase of life in some way. A daily news broadcast may start with a weather forecast and end with an analysis of the stock market. In a newspaper at hand we see in the first five pages stories on an increase in the wholesale price index, an increase in the number of habeas corpus petitions filed, new findings on mothers who smoke, the urgent need for timber laws, a state board plan for evaluation of teachers, popularity of new commuting buses, a desegregation plan, and sex bias. Each article reports some information, proposal, or conclusion based on the organization and analysis of numerical data.

Descriptive Statistics

2. Organization of Data
Abstract
In Chapter 2 we start with the statistical information as it is obtained by the investigator; this information might be an instructor’s list of students and their grades, a record of the tax rates of counties in Florida, or the prices of Grade A large eggs in each of ten Chicago grocery stores averaged over the past 36 months. We refer to such statistical information as data, the recorded results of observation. After the collection of data, the next step in a statistical study is to organize the information in meaningful ways, often in the form of tables and graphs or charts. These displays or summaries are descriptions which help the investigator, as well as the eventual reader of the study, to understand the implications of the collected information. In later chapters we shall develop numerical descriptions that are more succinct than these tables and charts.
3. Measures of Location
Abstract
After a set of data has been collected, it must be organized and condensed or categorized for purposes of analysis. In addition to graphical summaries, numerical indices can be computed that summarize the primary features of the data set. One is an indicator of location or central tendency that specifies where the set of measurements is “located” on the number line; it is a single number that designates the center of a set of measurements. In this chapter we consider several indices of location and show how each of them tells us about a central point in the data.
4. Measures of Variability
Abstract
Although for some purposes an average may be a sufficient description of a set of data, usually more information about the data is needed. An important feature of statistical data is their variability—how much the measurements differ from individual to individual. In this chapter we discuss the numerical evaluation of variability. A synonym for variability is dispersion, and other terms are sometimes used for the same concept including “spread” or “scatter.”
5. Summarizing Multivariate Data: Association Between Numerical Scales
Abstract
Statistical data are often used to answer questions about relationships between variables. Chapters 2 through 4 of this book describe ways to summarize data on a single variable. In Chapters 5 and 6 methods are described for summarizing the relationship or association between two or among three or more variables. Chapter 5 considers association among variables measured on numerical scales; Chapter 6 discusses two or more categorical variables.
6. Summarizing Multivariate Data: Association Between Categorical Variables
Abstract
Statistical data are used frequently to answer questions about the association of two or more variables. When the variables have numerical scales, association may be examined through scatter plots and the correlational techniques discussed in Chapter 5. In this chapter we discuss methods for examining relationships between and among categorical variables.

Probability

7. Basic Ideas of Probability
Abstract
Each of us has some intuitive notion of what “probability” is. Everyday conversation is full of references to it: “He’ll probably return on Saturday.” “Maybe he won’t.” “The chances are she’ll forget.” “The odds on winning are small.”
8. Probability Distributions
Abstract
Statistical inference is discussed in Part Four of this book. Inference is the process of drawing conclusions about populations of interest from samples of data. In this chapter we introduce the terminology associated with population distributions. In Chapter 9 we present the theory used as a basis for drawing inferential conclusions. As the reader will see, these two sets of principles are closely related.
9. Sampling Distributions
Abstract
Statistical inference is the process of drawing conclusions about a population of interest from a sample of data. In order to develop and evaluate methods for using sample information to obtain knowledge of the population, it is necessary to know how closely a descriptive quantity such as the mean or the median of a sample resembles the corresponding population quantity. In this chapter the ideas of probability will be used to study the sample-to-sample variability of these descriptive quantities. The ways in which one sample differs from another, and thus how they are both likely to differ from the corresponding population value, is the key theoretical concept underlying statistical inference.

Statistical Inference

10. Using a Sample to Estimate Characteristics of One Population
Abstract
In this section the theory developed in Part III is used to allow us to infer the characteristics of a population based on data from a sample. This is an essential part of statistical analysis because we often need to know about the parent population, but are not able to study every one of its members. For example, we might like to know what percentage of voters favor a particular political issue, but cannot survey all voters by phone; or we may need to know if a particular medication is effective, but cannot wait (or afford) to test it on every individual who contracts the disease before declaring it as effective or ineffective.
11. Answering Questions about Population Characteristics
Abstract
It is often the purpose of a statistical investigation to answer a yes-or-no question about some characteristic of a population. An election candidate, for example, may employ a pollster to determine whether the proportion of voters intending to vote for him does or does not exceed 1/2. The polio vaccine trial was designed so that medical researchers could decide whether the incidence rate of polio is or is not smaller in a population of persons inoculated with the vaccine than in a population of persons not inoculated with the vaccine. Industrial quality control involves determination as to whether the average strength, lifetime, or concentration of the product in each manufacturing batch does or does not fall within acceptable limits.
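The pollster's yes-or-no question can be illustrated with a short sketch. The poll counts below are hypothetical, and the large-sample z test shown is one common way to frame the question, not necessarily the procedure the chapter develops.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p0=0.5):
    """One-sided large-sample z test of H0: p <= p0 against H1: p > p0."""
    p_hat = successes / n
    # Standard error of the sample proportion, computed under the null value p0.
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Upper-tail probability under the standard normal distribution.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical poll (assumed numbers): 540 of 1,000 sampled voters favor the candidate.
z, p_value = proportion_z_test(540, 1000)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.4f}")
```

A small p-value suggests that the proportion of voters favoring the candidate exceeds 1/2; the normal approximation is reasonable here only because the sample is large.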
12. Differences Between Populations
Abstract
Frequently an investigator wishes to compare or contrast two populations—sets of individuals or objects. This may be done on the basis of a sample from each of the two populations, as when average incomes in two groups, average driving skills of males and females, or average attendance rates in two school districts are compared. The polio vaccine trial compared the incidence rate of polio in the hypothetical population of children who might be inoculated with the vaccine and the rate in the hypothetical population of those who might not be inoculated; the two groups of children observed were considered as samples from these respective (hypothetical) populations. This example illustrates an experiment in which a group receiving an experimental treatment is compared with a “control” group. Ideally, the control group is similar to the experimental group in every way except that its members are not given the treatment.
13. Variability in One Population and in Two Populations
Abstract
The expression, “A chain is only as strong as its weakest link,” may be construed as an admonition to consider the variability of the links as well as their average strength. In comparing distributions, averages alone are not always adequate. Figures 3.5 and 3.6 show two telephone waiting-time distributions with equal means (1.1 seconds) but very different shapes. In Section 4.3 we pointed out that their standard deviations, 0.41 second and 0.69 second, were quite different. If the pupils in each of two school classes have mean IQs of 100, but Class A has a standard deviation of 10, while Class B has a standard deviation of 20, teaching the relatively homogeneous Class A may be very different from teaching the relatively heterogeneous Class B.
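The two-class example can be made concrete with a minimal sketch. The five scores per class are invented for illustration and were chosen so that both classes have mean 100 while the population standard deviations come out to 10 and 20.

```python
from statistics import mean, pstdev

# Hypothetical IQ scores (assumed values, not data from the book).
class_a = [85, 95, 100, 105, 115]
class_b = [70, 90, 100, 110, 130]

for name, scores in (("Class A", class_a), ("Class B", class_b)):
    # pstdev divides by the number of scores (population standard deviation).
    print(f"{name}: mean = {mean(scores)}, standard deviation = {pstdev(scores):.1f}")
```

The means are identical, so only the standard deviation reveals that Class B is the more heterogeneous group.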

Statistical Methods for Other Problems

14. Inference on Categorical Data
Abstract
In this chapter we present some methods for treatment of categorical data. The methods involve the comparison of a set of observed frequencies with frequencies specified by some hypothesis to be tested. In Section 14.1 the hypothesis is that one categorical variable has a specific distribution. A test of such a hypothesis is called a test of goodness of fit.
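Comparing observed frequencies with those specified by a hypothesis is commonly done with Pearson's chi-square statistic. The sketch below assumes that statistic and uses invented die-roll counts; the function name and the data are illustrative and not taken from the book.

```python
def chi_square_statistic(observed, expected):
    """Pearson's chi-square statistic: sum of (observed - expected)^2 / expected."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 120 rolls of a die, tallied by face.
observed = [18, 22, 25, 15, 19, 21]
# Hypothesis to be tested: the die is fair, so each face is expected equally often.
expected = [sum(observed) / len(observed)] * len(observed)

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1
print(f"chi-square = {statistic:.2f} on {degrees_of_freedom} degrees of freedom")
```

The statistic is then compared with a tabulated chi-square critical value for the given degrees of freedom; a value well below the critical value gives no reason to doubt the hypothesized distribution.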
15. Simple Regression Analysis
Abstract
In this chapter we return to the statistical relationship between two quantitative variables. In Chapter 5 the correlation coefficient is described as a symmetric index of strength of association. In this chapter we examine the directional relationship of two variables. In many instances one variable may have a direct effect on the other or may be used to predict the other. For example, sodium intake may affect blood pressure; rainfall influences crop yield; SAT scores may predict college grade averages; and parents’ heights may predict offsprings’ heights.
16. Comparison of Several Populations
Abstract
Throughout this book we have stressed the basic statistical concept of variability. When some measurement, such as height or aptitude for a particular job, is made on several individuals, the values vary from person to person. The variability of a quantitative scale is measured by its variance. If the set of individuals is stratified into more homogeneous groups, the variance of the measurements within the more homogeneous groups will be less than that of the measurements in the entire group; that is what “more homogeneous” means. For example, the variance of the heights of pupils in an elementary school is usually greater than the variance of heights of pupils in just the first grade, the variance in the second grade, and the variance in each of the other grades. At the same time, the average height of pupils also varies from grade to grade.
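The school-heights example can be sketched numerically. The heights below are invented for illustration; they show the variance within each grade falling below the variance of all pupils taken together, because the grade averages also differ.

```python
from statistics import pvariance

# Hypothetical heights in centimeters for three grades (illustrative values only).
heights_by_grade = {
    "grade 1": [112, 115, 118, 121, 124],
    "grade 2": [118, 121, 124, 127, 130],
    "grade 3": [124, 127, 130, 133, 136],
}

# Variance of the heights within each grade taken separately.
within_variances = {
    grade: pvariance(heights) for grade, heights in heights_by_grade.items()
}

# Variance of all heights pooled together, ignoring grade.
all_heights = [h for heights in heights_by_grade.values() for h in heights]
total_variance = pvariance(all_heights)

for grade, variance in within_variances.items():
    print(f"{grade}: variance = {variance}")
print(f"whole school: variance = {total_variance}")
```

In this constructed data set the excess of the pooled variance over the within-grade variance is exactly the variance of the grade means, which is the decomposition that comparisons of several populations build on.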
17. Sampling from Populations: Sample Surveys
Abstract
Much empirical data arises from experiments, in which the investigator interacts in some way with the units of observation and actually influences the conditions of the units leading to the measurements. Many other sets of data result from simply observing, that is, making a survey. It is to such investigations that we now turn our attention. Usually one cannot observe every individual in the population, and often this would not even be desirable, for many individuals are similar. One does not need to eat the whole bowl to learn how the soup tastes; a spoonful will suffice, provided that the soup has been adequately stirred. The “spoonful” is a sample from the bowl (population), and “stirring” corresponds to drawing a random sample.
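The soup analogy can be illustrated with a small simulation. The population here is synthetic (10,000 values drawn from an assumed normal distribution), and simple random sampling without replacement stands in for "stirring"; none of the numbers come from the book.

```python
import random
from statistics import mean

# Seeded generator so the illustration is reproducible.
rng = random.Random(0)

# Hypothetical population: the whole "bowl" of 10,000 measurements.
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Simple random sample without replacement: the "spoonful" of 400 units.
sample = rng.sample(population, 400)

print(f"population mean = {mean(population):.2f}")
print(f"sample mean     = {mean(sample):.2f}")
```

Because every unit has the same chance of selection, the sample mean typically lands close to the population mean even though only 4 percent of the units are observed.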
Backmatter
Metadata
Title
The New Statistical Analysis of Data
Authors
T. W. Anderson
Jeremy D. Finn
Copyright year
1996
Publisher
Springer New York
Electronic ISBN
978-1-4612-4000-6
Print ISBN
978-1-4612-8466-6
DOI
https://doi.org/10.1007/978-1-4612-4000-6