
1992 | Book

Breakthroughs in Statistics

Methodology and Distribution

edited by: Samuel Kotz, Norman L. Johnson

Publisher: Springer New York

Book series: Springer Series in Statistics


Table of Contents

Frontmatter
Introduction to Pearson (1900) On the Criterion that a Given System of Deviations from the Probable in the Case of a Correlated System of Variables is Such that it Can be Reasonably Supposed to have Arisen from Random Sampling

In 1984, the journal Science published a series of articles listing 20 important scientific breakthroughs of the 20th century: relativity, the quantum theory, and so on. One article in the series was devoted to the paper we are about to discuss. Written by Ian Hacking, it is referenced below. Karl Pearson’s paper is the fifth in a series of his early papers, most of which are concerned with the mathematical problems of biological evolution. Insofar as this paper is the first of the series that is not primarily concerned with biological problems, it could be taken to represent the break into modern, 20th century statistics.

G. A. Barnard
On the Criterion that a Given System of Deviations from the Probable in the Case of a Correlated System of Variables is Such that it Can be Reasonably Supposed to have Arisen from Random Sampling

Let $$x_1, x_2, \ldots, x_n$$ be a system of deviations from the means of $$n$$ variables with standard deviations $$\sigma_1, \sigma_2, \ldots, \sigma_n$$ and with correlations $$r_{12}, r_{13}, r_{23}, \ldots, r_{n-1,n}$$.

Karl Pearson
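
As an editorial illustration of where this setup leads, the criterion Pearson derives is the familiar chi-square statistic for grouped frequencies, $$X^2 = \sum_i (O_i - E_i)^2 / E_i$$. A minimal sketch in Python (NumPy and SciPy assumed; the degrees of freedom must be chosen to suit the problem at hand):

import numpy as np
from scipy.stats import chi2

def pearson_chi_square(observed, expected):
    """Pearson's criterion X^2 = sum_i (O_i - E_i)^2 / E_i."""
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    return np.sum((observed - expected) ** 2 / expected)

# Illustrative example: a die thrown 60 times, 10 throws expected per face.
x2 = pearson_chi_square([8, 12, 9, 11, 6, 14], [10] * 6)
p_value = chi2.sf(x2, df=5)  # df = 6 categories - 1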
Introduction to Student (1908) The Probable Error of a Mean

Testing a hypothesis about the mean $$\xi$$ of a population on the basis of a sample $$X_1, \ldots, X_n$$ from that population was treated throughout the 19th century by a large-sample approach that goes back to Laplace. If the sample mean $$\bar X$$ is considered to be the natural estimate of $$\xi$$, the hypothesis $$H\colon \xi = \xi_0$$ should be rejected when $$\bar X$$ differs sufficiently from $$\xi_0$$. Furthermore, since for large $$n$$ the distribution of $$\sqrt{n}(\bar X - \xi_0)/\sigma$$ is approximately standard normal under $$H$$ (where $$\sigma^2 < \infty$$ is the variance of the $$X$$'s), this suggests rejecting $$H$$ when $$\sqrt{n}\,|\bar X - \xi_0|/\sigma$$ exceeds the appropriate critical value calculated from that distribution.

E. L. Lehmann
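
A minimal sketch of the large-sample procedure just described, assuming $$\sigma$$ is known (an editorial illustration in Python using NumPy and SciPy, not code from the introduction):

import numpy as np
from scipy.stats import norm

def large_sample_mean_test(x, xi0, sigma, alpha=0.05):
    """Reject H: xi = xi0 when sqrt(n)|xbar - xi0| / sigma exceeds
    the two-sided standard normal critical value."""
    x = np.asarray(x, dtype=float)
    z = np.sqrt(len(x)) * abs(x.mean() - xi0) / sigma
    return z, z > norm.ppf(1 - alpha / 2)

# Example with made-up data and a known sigma of 0.2:
# z, reject = large_sample_mean_test([5.1, 4.8, 5.3, 5.0], xi0=5.0, sigma=0.2)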
The Probable Error of a Mean

Any experiment may be regarded as forming an individual of a “population” of experiments which might be performed under the same conditions. A series of experiments is a sample drawn from this population.

Student
Introduction to Fisher (1925) Statistical Methods for Research Workers

R.A. Fisher had an abiding interest in inference, which came out in many of his writings. The inaugural meeting of the British Region of the Biometric Society was notable for his far-reaching presidential address (Fisher, 1948) in which he saw statistical science as doing for inference what mathematics had done for deduction. Among his contributions to the subject, the most influential has surely been the concept of significance, widely used from the time of its appearance. Although immensely valuable as an intellectual tool, its widespread and uncritical use has arguably been harmful as well as beneficial.

S. C. Pearce
Statistical Methods for Research Workers

The prime object of this book is to put into the hands of research workers, and especially of biologists, the means of applying statistical tests accurately to numerical data accumulated in their own laboratories or available in the literature.

R. A. Fisher
Introduction to Fisher (1926) The Arrangement of Field Experiments

In 1919, the Director of Rothamsted Experimental Station, Sir John Russell, invited Ronald Aylmer Fisher, a young mathematician with interests in evolution and genetics, to join the small group of scientists at Rothamsted in order that [see Russell (1966, p. 327)] “after studying our records he should tell me whether they were suitable for proper statistical examination and might be expected to yield more information than we had extracted.” Fisher accepted the invitation and in a very short time Russell realized (loc. cit.) “that he was more than a man of great ability; he was in fact a genius who must be retained.” In the few years that followed, Fisher introduced the subdivision of sums of squares now known as an analysis of variance (anova) table (1923), derived the exact distribution of the (log of the) ratio of two independent chi-squared variates (1924), introduced the principles of blocking and randomization, as well as the randomized block, Latin square, and split-plot experiments, the latter with two anova tables (1925), promoted factorial experiments, and foreshadowed the notion of confounding (1926). Of course Fisher made many contributions to theoretical statistics over this same period [see Fisher (1922)], but the above relate directly to the design and analysis of field experiments, the topic of the paper that follows. It was an incredibly productive period for Fisher, with his ideas quickly transforming agricultural experimentation in Great Britain and more widely, and in major respects these ideas have remained the statistical basis of agricultural experimentation to this day.

T. P. Speed
The Arrangement of Field Experiments

The present position of the art of field experimentation is one of rather special interest. For more than fifteen years the attention of agriculturalists has been turned to the errors of field experiments. During this period, experiments of the uniformity trial type have demonstrated the magnitude and ubiquity of that class of error which cannot be ascribed to carelessness in measuring the land or weighing the produce, and which is consequently described as due to “soil heterogeneity”; much ingenuity has been expended in devising plans for the proper arrangement of the plots; and not without result, for there can be little doubt that the standard of accuracy has been materially, though very irregularly, raised. What makes the present position interesting is that it is now possible to demonstrate (a) that the actual position of the problem is very much more intricate than was till recently imagined, but that realising this (b) the problem itself becomes much more definite and (c) its solution correspondingly more rigorous.

R. A. Fisher
Introduction to Kolmogorov (1933) On the Empirical Determination of a Distribution

In 1933, A.N. Kolmogorov (1933a) published a short but landmark paper in the Italian Giornale dell’Istituto Italiano degli Attuari. He formally defined the empirical distribution function (EDF) and then enquired how close this would be to the true distribution F(x) when this is continuous. This leads naturally to the definition of what has come to be known as the Kolmogorov statistic (or sometimes the Kolmogorov-Smirnov statistic) D, and Kolmogorov not only demonstrates that the difference between the EDF and F(x) can be made as small as we please as the sample size n becomes larger, but also gives a method for calculating the distribution of D at specified points, for finite n, and uses this to give the asymptotic distribution of D. The ideas in this paper have formed a platform for a vast literature, both on interesting and important probability problems and on methods of using the Kolmogorov statistic (and other statistics) for testing fit to a distribution. This literature continues with great strength today, after over 50 years, showing no signs of diminishing. It is evident that the ideas set in motion by Kolmogorov are of paramount importance in statistical analysis, and variations on the probabilistic problems, including modern methods of treating them, continue to hold attention.

M. A. Stephens
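
A minimal sketch of the quantities described above: the empirical distribution function and the discrepancy $$D = \sup_x |F_n(x) - F(x)|$$ against a fully specified continuous F (an editorial illustration; the standard normal in the example is an arbitrary choice):

import numpy as np
from scipy.stats import norm

def kolmogorov_D(x, cdf):
    """D = sup_x |F_n(x) - F(x)| for a fully specified continuous cdf.
    The supremum is attained at the jump points of the EDF."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    F = cdf(x)
    # The EDF equals i/n just after x_(i) and (i-1)/n just before it.
    d_plus = np.max(np.arange(1, n + 1) / n - F)
    d_minus = np.max(F - np.arange(0, n) / n)
    return max(d_plus, d_minus)

# Example: compare a sample against the standard normal distribution.
# D = kolmogorov_D(np.random.normal(size=100), norm.cdf)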
On the Empirical Determination of a Distribution Function

Contribution to the study of the possibility of determining a distribution function, knowing the results of a finite number of trials.

A. Kolmogorov
Introduction to Neyman (1934) On the Two Different Aspects of the Representative Method: The Method of Stratified Sampling and the Method of Purposive Selection

Neyman was born in 1894 in Russia to Polish parents. An extensive account of his private and professional life is found in Reid (1982), which divides Neyman’s life into three periods.

T. Dalenius
On the Two Different Aspects of the Representative Method: the Method of Stratified Sampling and the Method of Purposive Selection

Owing to the work of the International Statistical Institute,* and perhaps still more to personal achievements of Professor A.L. Bowley, the theory and the possibility of practical applications of the representative method has attracted the attention of many statisticians in different countries. Very probably this popularity of the representative method is also partly due to the general crisis, to the scarcity of money and to the necessity of carrying out statistical investigations connected with social life in a somewhat hasty way. The results are wanted in some few months, sometimes in a few weeks after the beginning of the work, and there is neither time nor money for an exhaustive research.

Jerzy Neyman
Introduction to Hotelling (1936) Relations Between Two Sets of Variates

Let the random vector $$x$$ with mean 0 be partitioned into subvectors of s and t components, respectively, $$x = (x^{(1)\prime}, x^{(2)\prime})^\prime$$, and let the covariance matrix $$\mathcal{E}xx^\prime = \Sigma$$ be partitioned conformally
$$ \mathcal{E}xx^\prime = \begin{bmatrix} \mathcal{E}x^{(1)}x^{(1)\prime} & \mathcal{E}x^{(1)}x^{(2)\prime} \\ \mathcal{E}x^{(2)}x^{(1)\prime} & \mathcal{E}x^{(2)}x^{(2)\prime} \end{bmatrix} = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} = \Sigma. $$

T. W. Anderson
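
Given the partitioned covariance matrix above, the canonical correlations are the square roots of the eigenvalues of $$\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$$. A sketch of the sample analogue (an editorial illustration; a production implementation would prefer an SVD-based routine for numerical stability):

import numpy as np

def canonical_correlations(X1, X2):
    """Sample canonical correlations between two sets of variates.
    Rows are observations; columns are the s and t components."""
    X1 = X1 - X1.mean(axis=0)
    X2 = X2 - X2.mean(axis=0)
    n = X1.shape[0]
    S11 = X1.T @ X1 / n
    S22 = X2.T @ X2 / n
    S12 = X1.T @ X2 / n
    # Eigenvalues of S11^{-1} S12 S22^{-1} S21 are the squared correlations.
    M = np.linalg.solve(S11, S12) @ np.linalg.solve(S22, S12.T)
    eigvals = np.linalg.eigvals(M).real
    return np.sqrt(np.clip(np.sort(eigvals)[::-1], 0.0, 1.0))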
Relations Between Two Sets of Variates

Concepts of correlation and regression may be applied not only to ordinary one-dimensional variates but also to variates of two or more dimensions. Marksmen side by side firing simultaneous shots at targets, so that the deviations are in part due to independent individual errors and in part to common causes such as wind, provide a familiar introduction to the theory of correlation; but only the correlation of the horizontal components is ordinarily discussed, whereas the complex consisting of horizontal and vertical deviations may be even more interesting. The wind at two places may be compared, using both components of the velocity in each place. A fluctuating vector is thus matched at each moment with another fluctuating vector. The study of individual differences in mental and physical traits calls for a detailed study of the relations between sets of correlated variates. For example the scores on a number of mental tests may be compared with physical measurements on the same persons. The questions then arise of determining the number and nature of the independent relations of mind and body shown by these data to exist, and of extracting from the multiplicity of correlations in the system suitable characterizations of these independent relations. As another example, the inheritance of intelligence in rats might be studied by applying not one but s different mental tests to N mothers and to a daughter of each

Harold Hotelling
Introduction to Wilcoxon (1945) Individual Comparisons by Ranking Methods

When, in 1945, Frank Wilcoxon published this unpretentious little paper, he could hardly have guessed that the two techniques he was proposing would soon occupy a central place in a newly developing branch of statistics that became known as nonparametrics. Wilcoxon (1892–1965), a physical chemist by training who was employed by the American Cyanamid Company in Stamford, Connecticut, came to statistics because of a need for analyzing laboratory data. His main motivation in developing the two new techniques seems to have been a desire to replace the endless t-statistics that he needed for the analysis of his laboratory measurements by something computationally simpler. In a subsequent publication [Wilcoxon (1949)], he explained:

It is not always realized that there are available rapid approximate methods which are quite useful in interpreting the results of experiments, even though these approximate methods do not utilize fully the information contained in the data.

The two procedures, now generally known as the Wilcoxon rank sum test (or two-sample test) and the Wilcoxon signed rank test (or one-sample test), are used for the comparison of two treatments involving unpaired and paired sample observations, respectively. In classical normal theory statistics, two treatments are compared with the help of appropriate t-tests. Wilcoxon proposed replacing the actual data values by their ranks to simplify computational effort.

G. E. Noether
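
A minimal sketch of the rank sum statistic for the unpaired case, using the large-sample normal approximation (an editorial illustration; the variance formula below assumes no ties, and scipy.stats offers ranksums and wilcoxon for practical use):

import numpy as np
from scipy.stats import rankdata, norm

def wilcoxon_rank_sum(x, y):
    """Rank the pooled observations and sum the ranks of the first sample.
    For moderate samples, W is approximately normal under the null."""
    combined = np.concatenate([x, y])
    ranks = rankdata(combined)          # midranks are used for ties
    W = ranks[: len(x)].sum()
    m, n = len(x), len(y)
    mean_W = m * (m + n + 1) / 2
    var_W = m * n * (m + n + 1) / 12    # no-ties null variance
    z = (W - mean_W) / np.sqrt(var_W)
    return W, 2 * norm.sf(abs(z))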
Individual Comparisons by Ranking Methods

The comparison of two treatments generally falls into one of the following two categories: (a) we may have a number of replications for each of the two treatments, which are unpaired, or (b) we may have a number of paired comparisons leading to a series of differences, some of which may be positive and some negative. The appropriate methods for testing the significance of the differences of the means in these two cases are described in most of the textbooks on statistical methods.

Frank Wilcoxon
Introduction to Mosteller (1946) On Some Useful “Inefficient” Statistics

The editors have not done the obvious in selecting this paper. Evidently, they were not put off by its opening words: “Several statistical techniques are proposed for economically analyzing large masses of data by means of punched-card equipment.” Moreover, very little of the paper survives unimproved in current statistical practice. Nevertheless, the author made a number of significant advances, pointed the way to many more, and showed great prescience. The paper, as a result, has been extremely influential and has stimulated much research on order statistics.

H. A. David
On Some Useful “Inefficient” Statistics

Several statistical techniques are proposed for economically analyzing large masses of data by means of punched-card equipment; most of these techniques require only a counting sorter. The methods proposed are designed especially for situations where data are inexpensive compared to the cost of analysis by means of statistically “efficient” or “most powerful” procedures. The principal technique is the use of functions of order statistics, which we call systematic statistics. It is demonstrated that certain order statistics are asymptotically jointly distributed according to the normal multivariate law. For large samples drawn from normally distributed variables we describe and give the efficiencies of rapid methods: (i) for estimating the mean by using 1, 2, …, 10 suitably chosen order statistics; (ii) for estimating the standard deviation by using 2, 4, or 8 suitably chosen order statistics; (iii) for estimating the correlation coefficient whether other parameters of the normal bivariate distribution are known or not (three sorting and three counting operations are involved). The efficiencies of procedures (ii) and (iii) are compared with the efficiencies of other estimates which do not involve sums of squares or products.

Frederick Mosteller
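
A sketch of the idea behind the systematic estimate of the mean: average a few order statistics taken at chosen quantile positions (an editorial illustration; the paper tabulates which positions are optimal and their efficiencies, whereas the positions here are supplied by the caller and merely illustrative):

import numpy as np

def systematic_mean_estimate(x, probs):
    """Estimate the mean of a (roughly normal) sample by averaging a few
    order statistics at the quantile positions given in probs."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    idx = np.clip((np.asarray(probs) * n).astype(int), 0, n - 1)
    return x[idx].mean()

# e.g. average two symmetric order statistics (positions are illustrative only):
# est = systematic_mean_estimate(data, probs=[0.25, 0.75])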
Introduction to Durbin and Watson (1950, 1951) Testing for Serial Correlation in Least Squares Regression. I, II

Fitting a linear function of some variables (denoted by $$x_1, \ldots, x_k$$) to a variable denoted $$y$$ by least squares is an old statistical technique. In the Fisherian revolution of the 1920s and 1930s, statistical methods were mainly applied in the natural sciences where one could often design the experiments that produced the data. The emphasis was largely on how the experiments should be designed to make the least-squares assumptions valid rather than on checking the correctness of these assumptions.

Maxwell L. King
Testing for Serial Correlation in Least Squares Regression. I

A great deal of use has undoubtedly been made of least squares regression methods in circumstances in which they are known to be inapplicable. In particular, they have often been employed for the analysis of time series and similar data in which successive observations are serially correlated. The resulting complications are well known and have recently been studied from the standpoint of the econometrician by Cochrane & Orcutt (1949). A basic assumption underlying the application of the least squares method is that the error terms in the regression model are independent. When this assumption—among others—is satisfied the procedure is valid whether or not the observations themselves are serially correlated. The problem of testing the errors for independence forms the subject of this paper and its successor. The present paper deals mainly with the theory on which the test is based, while the second paper describes the test procedures in detail and gives tables of bounds to the significance points of the test criterion adopted. We shall not be concerned in either paper with the question of what should be done if the test gives an unfavourable result.

J. Durbin, G. S. Watson
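
The test criterion adopted is the ratio of the sum of squared successive differences of the least-squares residuals to their sum of squares, $$d = \sum_{t=2}^{n}(e_t - e_{t-1})^2 \big/ \sum_{t=1}^{n} e_t^2$$. A minimal sketch (an editorial illustration; significance is then judged against the bounds d_L, d_U tabulated in Part II):

import numpy as np

def durbin_watson(y, X):
    """Least-squares fit of y on X (X should include a constant column),
    then d = sum (e_t - e_{t-1})^2 / sum e_t^2 on the residuals.
    Values near 2 suggest no serial correlation; significance is judged
    against the tabulated bounds of Part II."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)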
Testing for Serial Correlation in Least Squares Regression. II

In an earlier paper (Durbin & Watson, 1950) the authors investigated the problem of testing the error terms of a regression model for serial correlation. Test criteria were put forward, their moments calculated, and bounds to their distribution functions were obtained. In the present paper these bounds are tabulated and their use in practice is described. For cases in which the bounds do not settle the question of significance an approximate method is suggested. Expressions are given for the mean and variance of a test statistic for one- and two-way classifications and polynomial trends, leading to approximate tests for these cases. The procedures described should be capable of application by the practical worker without reference to the earlier paper (hereinafter referred to as Part I).

J. Durbin, G. S. Watson
Introduction to Box and Wilson (1951) On the Experimental Attainment of Optimum Conditions

Imperial Chemical Industries (ICI) was one of the first companies to recognize the value of statisticians and they had several, spread over the various divisions. ICI Dyestuff’s Division, in particular, had the good fortune and foresight to have George E.P. Box as its statistical head and K.B. Wilson as one of its chemists. Their cooperation in this particular study set the foundation for the entire present field of response surface methodology.

Norman R. Draper
On the Experimental Attainment of Optimum Conditions

The work described is the result of a study extending over the past few years by a chemist and a statistician. Development has come about mainly in answer to problems of determining optimum conditions in chemical investigations, but we believe that the methods will be of value in other fields where experimentation is sequential and the error fairly small.

G. E. P. Box, K. B. Wilson
Introduction to Kaplan and Meier (1958) Nonparametric Estimation from Incomplete Observations

During the late 1940s and early 1950s, stimulated in part by Sir Bradford Hill’s (1951) advocacy of the randomized clinical trial as a cornerstone of modern scientific medicine, medical research workers began to undertake numerous follow-up studies of patient populations in order to assess the effectiveness of medical treatment. A challenging problem in the analysis of data from such investigations, whose typical endpoint was the death of a patient, stemmed from the fortunate fact that not all the patients had died at the end of the study. Thus, some of the observed lifetimes were censored on the right, meaning simply that they were known only to exceed an observation limit equal to the time interval from the patient’s entry on study to its close or, if follow-up was incomplete, to the time of loss from further observation. This precluded the use of classical statistical procedures based on averages.

N. E. Breslow
Nonparametric Estimation from Incomplete Observations

In lifetesting, medical follow-up, and other fields the observation of the time of occurrence of the event of interest (called a death) may be prevented for some of the items of the sample by the previous occurrence of some other event (called a loss). Losses may be either accidental or controlled, the latter resulting from a decision to terminate certain observations. In either case it is usually assumed in this paper that the lifetime (age at death) is independent of the potential loss time; in practice this assumption deserves careful scrutiny. Despite the resulting incompleteness of the data, it is desired to estimate the proportion P(t) of items in the population whose lifetimes would exceed t (in the absence of such losses), without making any assumption about the form of the function P(t). The observation for each item of a suitable initial event, marking the beginning of its lifetime, is presupposed. For random samples of size N the product-limit (PL) estimate can be defined as follows: List and label the N observed lifetimes (whether to death or loss) in order of increasing magnitude, so that one has $$0 \leqslant t_1^\prime \leqslant t_2^\prime \leqslant \cdots \leqslant t_N^\prime$$. Then $$\hat P(t) = \prod_r \left[(N - r)/(N - r + 1)\right]$$, where r assumes those values for which $$t_r^\prime \leqslant t$$ and for which $$t_r^\prime$$ measures the time to death. This estimate is the distribution, unrestricted as to form, which maximizes the likelihood of the observations. Other estimates that are discussed are the actuarial estimates (which are also products, but with the number of factors usually reduced by grouping), and reduced-sample (RS) estimates, which require that losses not be accidental, so that the limits of observation (potential loss times) are known even for those items whose deaths are observed. When no losses occur at ages less than t the estimate of P(t) in all cases reduces to the usual binomial estimate, namely, the observed proportion of survivors.

E. L. Kaplan, Paul Meier
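
A minimal sketch of the product-limit estimate as defined above, assuming no tied observation times (an editorial illustration; times holds the observed lifetimes and died marks which observations ended in death rather than loss):

import numpy as np

def product_limit(times, died):
    """Kaplan-Meier product-limit estimate P_hat(t), evaluated just after
    each observed time, assuming no tied observation times."""
    order = np.argsort(times)
    times = np.asarray(times, dtype=float)[order]
    died = np.asarray(died, dtype=bool)[order]
    N = len(times)
    surv, s = [], 1.0
    for r in range(1, N + 1):           # r-th smallest observation
        if died[r - 1]:
            s *= (N - r) / (N - r + 1)  # a factor enters only at a death
        surv.append((times[r - 1], s))
    return surv

# Example: lifetimes 1, 3+, 4, 6, where + marks a loss (censoring).
# product_limit([1, 3, 4, 6], [True, False, True, True])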
Introduction to Chernoff (1959) Sequential Design of Experiments

Herman Chernoff was born in New York City on July 1, 1923. His parents immigrated from Russia. Herman graduated from Townsend Harris High School in New York City in 1939 and entered New York University to study mathematics (with a minor in physics). He received a B.S. in 1943 and went to Brown University for graduate studies in applied mathematics. With a brief interruption (as a result of the draft in 1945), he completed his non-thesis requirement at Brown and came to Columbia University, in January 1947, to work on his dissertation under the supervision of Abraham Wald. He was awarded a Ph.D. from Brown University in 1948. Herman served on the faculty at the University of Illinois, Stanford University, and MIT; currently, he is at Harvard University. He is a member of the National Academy of Sciences, Fellow (and past president) of the Institute of Mathematical Statistics, Fellow of the American Statistical Association, elected member of the International Statistical Institute, and a member of a number of other professional societies.

P. K. Sen
Sequential Design of Experiments

Considerable scientific research is characterized as follows. The scientist is interested in studying a phenomenon. At first he is quite ignorant and his initial experiments are preliminary and tentative. As he gathers relevant data, he becomes more definite in his impression of the underlying theory. This more definite impression is used to construct more informative experiments. Finally after a certain point he is satisfied that his evidence is sufficient to allow him to announce certain conclusions and he does so.

Herman Chernoff
Introduction to Box and Jenkins (1962) Some Statistical Aspects of Adaptive Optimization and Control

This is a long paper, full of insights and innovations, many of which have become key elements in modern time series analysis. It was presented as a discussion paper, and many of the discussants seem to have recognized it as a landmark, though perhaps not for the reasons modern statisticians would find it important. While ostensibly on optimization and control, it contains many gems that are precursors to modern statistical and data analytic notions. Indeed, the not unrelated papers by Kalman (1960) and Kalman and Bucy (1961) have proven to have more direct impact on control and linear filtering theory. Nonetheless, this paper surely ranks among the most important statistics papers of the modern era. It represents the first in a series of collaborations between Box and Jenkins that came to full fruition in 1970 with the publication of the now classic text on time domain parametric time series models, including autoregressive, moving average, and mixed models. Preliminary discussions of these models appear in this paper. In addition, a description of technical feedback is given that anticipates modern exploratory data analysis. The treatment of nonstationarity using difference operators is also noted, as well as the suggestion of modern so-called high-resolution spectral analysis.

Edward J. Wegman
Some Statistical Aspects of Adaptive Optimization and Control

It is often necessary to adjust some variable X, such as the concentration of consecutive batches of a product, to keep X close to a specified target value. A second, more complicated problem occurs when the independent variables X in a response function η(X) are to be adjusted so that the derivatives ∂η/∂X are kept close to a target value of zero, thus maximizing or minimizing the response. These are shown to be essentially problems of prediction, and the paper is devoted mainly to the estimation from past data of the “best” adjustments to be applied in the first problem.

G. E. P. Box, G. M. Jenkins
Introduction to Tukey (1962) The Future of Data Analysis

Among articles from the Annals of Mathematical Statistics (AMS), “The Future of Data Analysis” is unusual in many ways.

Lyle V. Jones
The Future of Data Analysis

For a long time I have thought I was a statistician, interested in inferences from the particular to the general. But as I have watched mathematical statistics evolve, I have had cause to wonder and to doubt. And when I have pondered about why such techniques as the spectrum analysis of time series have proved so useful, it has become clear that their “dealing with fluctuations” aspects are, in many circumstances, of lesser importance than the aspects that would already have been required to deal effectively with the simpler case of very extensive data, where fluctuations would no longer be a problem. All in all, I have come to feel that my central interest is in data analysis, which I take to include, among other things: procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data.

John W. Tukey
Introduction to Birch (1963) Maximum Likelihood in Three-Way Contingency Tables

During the 1960s, there was an extremely rapid expansion of the literature on categorical data analysis, especially related to the use of loglinear models for multidimensional contingency tables, culminating in a series of papers by Goodman [e.g., see those collected in Goodman (1978)] and the monographs by Bishop et al. (1975) and Haberman (1974). The problems addressed by this literature paralleled for discrete/categorical data those addressed by multiple regression, analysis of variance, and the general linear model for continuous data. Loglinear model methods have had a major impact on the statistical methodology now in use for the analysis of categorical data, and a crucial paper by Martin Birch that appeared in 1963 provides the underpinnings of the general statistical theory involved.

Stephen E. Fienberg
Maximum Likelihood in Three-Way Contingency Tables

Interactions in three-way and many-way contingency tables are defined as certain linear combinations of the logarithms of the expected frequencies. Maximum-likelihood estimation is discussed for many-way tables and the solutions given for three-way tables in the cases of greatest interest.

Martin W. Birch
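
Birch characterizes the maximum-likelihood solutions; in practice they are typically computed by iterative proportional fitting (the Deming-Stephan algorithm), which is not part of the paper itself. A sketch for the no-three-factor-interaction model of a three-way table, scaling the fitted values to match each observed two-way margin in turn (an editorial illustration assuming positive observed margins):

import numpy as np

def ipf_no_three_way(table, n_iter=50):
    """Iterative proportional fitting for a three-way contingency table:
    scale the fitted values to match each observed two-way margin in turn.
    The fixed point gives the MLE under the no-three-factor-interaction
    loglinear model (assumes positive observed margins)."""
    obs = np.asarray(table, dtype=float)
    fit = np.ones_like(obs)
    for _ in range(n_iter):
        for axis in range(3):  # match the margin obtained by summing over this axis
            ratio = obs.sum(axis=axis, keepdims=True) / fit.sum(axis=axis, keepdims=True)
            fit = fit * ratio
    return fit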
Introduction to Huber (1964) Robust Estimation of a Location Parameter

Huber’s first paper on robust statistics is outstanding in several respects. It contains the first encompassing mathematical definitions of the “approximate validity of a parametric model” and thus became the founding paper of the “stability theory of statistical procedures” that by a historical accident was called “robust statistics.” Since parametric models, like the model of normality, are almost always at best only approximately valid, the paper for the first time fills a conceptual gap left open in Fisher’s theory of exact parametric models [Fisher (1922), see in particular p. 314]. The practical importance of this gap was convincingly demonstrated by E.S. Pearson (1931) and others for tests and by Tukey (1960) for estimation. But even apart from the practical aspect, Huber’s paper is important for the logical chain of statistical reasoning.

Frank R. Hampel
Robust Estimation of a Location Parameter

This paper contains a new approach toward a theory of robust estimation; it treats in detail the asymptotic theory of estimating a location parameter for contaminated normal distributions, and exhibits estimators—intermediaries between sample mean and sample median—that are asymptotically most robust (in a sense to be specified) among all translation invariant estimators. For the general background, see Tukey (1960) (p. 448 ff.).

Peter J. Huber
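
A sketch of the location estimate at the heart of the paper: solve $$\sum_i \psi((x_i - \theta)/s) = 0$$ with $$\psi(u) = \max(-k, \min(k, u))$$, computed here by iteratively reweighted averaging with the scale s fixed at a rescaled median absolute deviation (an editorial illustration; the tuning constant k and the scale choice are assumptions of the sketch, not prescriptions of the paper):

import numpy as np

def huber_location(x, k=1.5, n_iter=50):
    """Huber M-estimate of location via iteratively reweighted averaging.
    Scale is fixed at the median absolute deviation, rescaled so that it
    estimates sigma for normal data."""
    x = np.asarray(x, dtype=float)
    theta = np.median(x)
    s = np.median(np.abs(x - theta)) / 0.6745
    if s == 0:
        return theta
    for _ in range(n_iter):
        u = (x - theta) / s
        # weights psi(u)/u: 1 in the middle, k/|u| in the tails
        w = np.minimum(1.0, k / np.maximum(np.abs(u), 1e-12))
        theta = np.sum(w * x) / np.sum(w)
    return theta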
Introduction to Cox (1972) Regression Models and Life-Tables

In this paper, Sir David Cox proposed a stimulating and pioneering procedure for the regression analysis of censored failure time data. Within a few years of publication, this procedure became a data analytic standard in a number of application areas, most notably in the biomedical sciences. The procedure has also stimulated considerable related methodologic development.

Ross L. Prentice
Regression Models and Life-Tables

The analysis of censored failure times is considered. It is assumed that on each individual are available values of one or more explanatory variables. The hazard function (age-specific failure rate) is taken to be a function of the explanatory variables and unknown regression coefficients multiplied by an arbitrary and unknown function of time. A conditional likelihood is obtained, leading to inferences about the unknown regression coefficients. Some generalizations are outlined.

David R. Cox
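
The central quantity is the conditional (partial) likelihood built from the risk set at each failure time. A minimal sketch of its logarithm for untied failure times (an editorial illustration; in practice one maximizes this over the regression coefficients, for example by Newton's method):

import numpy as np

def cox_log_partial_likelihood(beta, times, died, X):
    """log PL(beta) = sum over failures i of
    [ x_i' beta - log( sum_{j in risk set at t_i} exp(x_j' beta) ) ],
    where the risk set holds subjects with time >= t_i.
    Assumes no tied failure times."""
    beta = np.asarray(beta, dtype=float)
    times = np.asarray(times, dtype=float)
    died = np.asarray(died, dtype=bool)
    eta = X @ beta
    ll = 0.0
    for i in np.where(died)[0]:
        at_risk = times >= times[i]
        ll += eta[i] - np.log(np.sum(np.exp(eta[at_risk])))
    return ll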
Introduction to Nelder and Wedderburn (1972) Generalized Linear Models

In this paper, the authors show that maximum likelihood estimates for a large class of commonly used regression models can be obtained by the method of iteratively weighted least squares, in which both the weights and the response are adjusted from one iteration to the next. The proposed algorithm, sometimes known as “Fisher-scoring,” is an extension of Fisher’s (1935) method for computing maximum likelihood estimates in linear probit models. The same result was obtained independently by Bradley (1973) and Jennrich and Moore (1975), though not exploited to its full extent.

P. McCullagh
Generalized Linear Models

The technique of iterative weighted linear regression can be used to obtain maximum likelihood estimates of the parameters with observations distributed according to some exponential family and systematic effects that can be made linear by a suitable transformation. A generalization of the analysis of variance is given for these models using log-likelihoods. These generalized linear models are illustrated by examples relating to four distributions: the Normal, Binomial (probit analysis, etc.), Poisson (contingency tables), and gamma (variance components). The implications of the approach in designing statistics courses are discussed.

J. A. Nelder, R. W. M. Wedderburn
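
A sketch of the iteratively reweighted least-squares scheme the paper describes, specialized to Poisson counts with a log link (an editorial illustration; the weights and the adjusted, or working, response are recomputed from the current fit at every iteration):

import numpy as np

def poisson_glm_irls(X, y, n_iter=25):
    """Fit a Poisson log-linear model by iteratively reweighted least squares:
    eta = X beta, mu = exp(eta), weights W = mu,
    working response z = eta + (y - mu) / mu,
    beta <- (X' W X)^{-1} X' W z."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        W = mu                             # Poisson variance equals the mean
        z = eta + (y - mu) / mu
        XtW = X.T * W                      # broadcast weights across observations
        beta = np.linalg.solve(XtW @ X, XtW @ z)
    return beta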
Introduction to Efron (1979) Bootstrap Methods: Another Look at the Jackknife

It is not unusual, in the history of statistics, that an important paper goes scarcely noticed for a decade or longer. Examples from the past half-century include von Mises’ (1937, 1947) papers on statistical functionals, Quenouille’s (1949) paper, Tukey’s (1956) abstract on the jackknife, and Wald’s (1943) paper on the asymptotic optimality of likelihood ratio tests. Each of these pioneering works was well ahead of its time. Brad Efron’s (1979) paper on the bootstrap sparked immediate interest among his peers. A decade after its publication, the bootstrap literature is large and still growing, with no immediate end in sight. Surely, the timing and formulation of Efron’s paper were just right. But what were the yearnings in the statistical world of 1979 that the paper touched so well? Why did development of the bootstrap idea follow so swiftly?

Rudolf J. Beran
Bootstrap Methods: Another Look at the Jackknife

We discuss the following problem: given a random sample $$X = (X_1, X_2, \ldots, X_n)$$ from an unknown probability distribution F, estimate the sampling distribution of some prespecified random variable R(X, F) on the basis of the observed data x. (Standard jackknife theory gives an approximate mean and variance in the case R(X, F) = $$\theta(\hat F) - \theta(F)$$, θ some parameter of interest.) A general method, called the “bootstrap”, is introduced, and shown to work satisfactorily on a variety of estimation problems. The jackknife is shown to be a linear approximation method for the bootstrap. The exposition proceeds by a series of examples: variance of the sample median, error rates in a linear discriminant analysis, ratio estimation, estimating regression parameters, etc.

Bradley Efron
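
A minimal sketch of the basic recipe for one of the paper's examples, the variability of the sample median: resample the data with replacement, recompute the statistic on each resample, and read off the spread of the replicates (an editorial illustration; the number of resamples B and the fixed seed are choices of the sketch):

import numpy as np

def bootstrap_se_median(x, B=2000, seed=0):
    """Resample x with replacement B times, recompute the median each time,
    and use the spread of the replicates to estimate its standard error."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    medians = np.array([np.median(rng.choice(x, size=len(x), replace=True))
                        for _ in range(B)])
    return medians.std(ddof=1)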
Backmatter
Metadata
Title
Breakthroughs in Statistics
edited by
Samuel Kotz
Norman L. Johnson
Copyright Year
1992
Publisher
Springer New York
Electronic ISBN
978-1-4612-4380-9
Print ISBN
978-0-387-94039-7
DOI
https://doi.org/10.1007/978-1-4612-4380-9