
About this book

It is an honor to be asked to write a foreword to this book, for I believe that it and other books to follow will eventually lead to a dramatic change in the current statistics curriculum in our universities. I spent the 1975-76 academic year at Florida State University in Tallahassee. My purpose was to complete a book on Statistical Reliability Theory with Frank Proschan. At the time, I was working on total time on test processes. At the same time, I started attending lectures by Dev Basu on statistical inference. It was a hypothesis testing course, and Lehmann's book was the text. However, I noticed something strange: Basu never opened the book. He was obviously not following it. Instead, he was giving a very elegant, measure-theoretic treatment of the concepts of sufficiency, ancillarity, and invariance. He was interested in the concept of information: what it meant, and how it fitted in with contemporary statistics. As he looked at the fundamental ideas, the logic behind their use seemed to evaporate. I was shocked. I didn't like priors. I didn't like Bayesian statistics. But after the smoke had cleared, that was all that was left. Basu loves counterexamples. He is like an art critic in the field of statistical inference. He would find a counterexample to the Bayesian approach if he could. So far, he has failed in this respect.

Table of contents

Frontmatter

Information and Likelihood

Frontmatter

Chapter I. Recovery of Ancillary Information

Abstract
The main upsurge of the late Professor R.A. Fisher’s theory of Statistical Inference took place within a brief span of about 10 years (1920–30). It was during this period that Fisher came out with the notions of likelihood, fiducial probability, information and intrinsic accuracy, sufficiency, ancillary statistics and recovery of information — concepts around which the superstructure of the theory was built.
J. K. Ghosh

Chapter II. Statistical Information and Likelihood

Part I : Principles
Abstract
The key word in Statistics is information. After all, this is what the subject is all about. A problem in statistics begins with a state of nature, a parameter of interest ω, about which we do not have enough information. In order to generate further information about ω, we plan and then perform a statistical experiment ℰ. This generates the sample x. By the term ‘statistical data’ we mean such a pair (ℰ, x), where ℰ is a well-defined statistical experiment and x the sample generated by a performance of the experiment. The problem of data analysis is to extract ‘the whole of the relevant information’ (an expression made famous by R.A. Fisher) contained in the data (ℰ, x) about the parameter ω. But, what is information? No other concept in statistics is more elusive in its meaning and less amenable to a generally agreed definition.
J. K. Ghosh

Chapter III. Statistical Information and Likelihood

Part II : Methods
Abstract
In Part I our main concern was with the notion of statistical information in the data, and with some general principles of data analysis. Now we turn our attention from principles to a few methods of data analysis. By a non-Bayesian likelihood method we mean any method of data analysis that neither violates the likelihood principle ℒ nor explicitly incorporates into its inference-making process any prior information (that the experimenter may have about the parameter ω) in the form of a prior probability distribution over the parameter space Ω. The origin of most such methods may be traced back to the writings of R.A. Fisher. In this section we list several such methods. To fix our ideas, let us suppose that Ω is either discrete or an interval subset of the real line. In the latter case, we shall also suppose that the likelihood function L(ω) is a smooth function and has a single mode (whenever such an assumption is implicit in the method).
J. K. Ghosh
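As a concrete illustration of the kind of smooth, single-mode likelihood function this abstract presupposes, here is a minimal Python sketch (not taken from the book; the binomial experiment and all variable names are illustrative assumptions). It evaluates L(ω) on a grid and locates its mode, which coincides with the usual maximum likelihood estimate k/n.

    import numpy as np

    def likelihood(omega, n=10, k=7):
        # L(omega) proportional to omega^k * (1 - omega)^(n - k)
        # for a hypothetical binomial experiment with k successes in n trials
        return omega**k * (1.0 - omega)**(n - k)

    omega_grid = np.linspace(0.001, 0.999, 999)
    L = likelihood(omega_grid)
    mode = omega_grid[np.argmax(L)]  # numerical mode of the likelihood
    print(f"numerical mode ~ {mode:.3f}; analytic MLE k/n = {7 / 10}")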

Chapter IV. Statistical Information and Likelihood

Part III: Paradoxes
Abstract
I vividly recall an occasion in late 1955 when Sir Ronald (then visiting the Indian Statistical Institute, Calcutta and giving a series of seminars based on the manuscript of his forthcoming book) got carried away by his own enthusiasm for fiducial probability and tried to put the fiducial argument in the classical form of the Aristotelian syllogism known as Barbara: ‘A is B, C is A, therefore C is B’. The context was: A random variable X is known to be normally distributed with unit variance and unknown mean θ, about which the only information that we have is −∞ < θ < ∞. The variable X is observed and the observation is 5. Sir Ronald declared that the following constitutes a ‘proof’:
  • Major premise: Probability that the variable X exceeds θ is 1/2.
  • Minor premise: The variable X is observed and the observation is 5.
  • Conclusion: Probability that 5 exceeds θ is 1/2.
J. K. Ghosh
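To see why the ‘syllogism’ is contested, the following small simulation (a sketch in Python with NumPy, not from the book) confirms the major premise as a pre-data statement about X for any fixed θ; the fiducial conclusion then re-reads that statement after observing X = 5, with θ treated as the random quantity, a step the simulation does nothing to justify.

    import numpy as np

    rng = np.random.default_rng(0)
    for theta in (-3.0, 0.0, 5.0):
        x = rng.normal(loc=theta, scale=1.0, size=100_000)
        print(theta, np.mean(x > theta))  # each relative frequency is close to 0.5

    # The fiducial conclusion "P(5 > theta) = 1/2" re-interprets the same
    # statement after the data are in, with theta now playing the role of
    # the random quantity -- which is exactly the point of the paradox.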

Chapter V. Statistical Information and Likelihood Discussions

Without Abstract
J. K. Ghosh

Chapter VI. Partial Sufficiency

Abstract
In the beginning we have a parameter of interest — an unknown state of nature θ. With a view to gaining additional information on θ, we plan and then perform a statistical experiment ℰ and thus generate the sample x. The problem of data analysis is to extract all the relevant information in the data (ℰ, x) about the parameter of interest θ.
J. K. Ghosh

Chapter VII. Elimination of Nuisance Parameters

Abstract
The problem begins with an unknown state of nature represented by the parameter of interest θ. We have some information about θ to begin with — e.g., we know that θ is a member of some well-defined parameter space Θ — but we are seeking more. Toward this end, a statistical experiment ℰ is planned and performed and this generates the sample observation x. Further information about θ is then obtained by a careful analysis of the data (ℰ, x) in the light of all our prior information about θ and in the context of the particular inference problem related to θ. For going through the rituals of the traditional sample-space analysis of data, we must begin with the invocation of a trinity of abstractions (X, A, P), where X is the sample space, A is a σ-algebra of events (subsets of X), and P is a family of probability measures on A. If the model (X, A, P) is such that we can represent the family P as {Pθ : θ ∈ Θ}, where the correspondence θ → Pθ is one-one and (preferably) smooth, then we go about analyzing the data according to our own light and are thankful for not having to contend with any nuisance parameters.
J. K. Ghosh

Chapter VIII. Sufficiency and Invariance

Summary
Let (X, A, P) be a given statistical model and let 𝒢 be the class of all one-to-one, bimeasurable maps g of (X, A) onto itself such that g is measure-preserving for each P ∈ P, i.e., Pg⁻¹ = P for all P. Let us suppose that there exists a least (minimal) sufficient sub-field ℒ. Then, for each L ∈ ℒ, it is true that g⁻¹L is P-equivalent to L for each g ∈ 𝒢, i.e., the least sufficient sub-field is almost 𝒢-invariant. It is demonstrated that, in many familiar statistical models, the least sufficient sub-field and the sub-field of all almost 𝒢-invariant sets are indeed P-equivalent. The problem of data reduction in the presence of nuisance parameters has been discussed very briefly. It is shown that in many situations the principle of invariance is strong enough to lead us to the standard reductions. For instance, given n independent observations on a normal variable with unknown mean (the nuisance parameter) and unknown variance, it is shown how the principle of invariance alone can reduce the data to the sample variance.
J. K. Ghosh
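A minimal numerical sketch of the invariance idea (assuming Python/NumPy; not from the chapter, which works with sub-fields rather than simulations): under the location group that shifts every observation by the same constant, the sample variance is unchanged while the raw data and the sample mean are not.

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.normal(loc=2.0, scale=3.0, size=20)  # hypothetical normal sample

    for c in (0.0, -5.0, 100.0):
        shifted = x + c  # apply the location map g_c(x) = x + c
        print(c, round(shifted.mean(), 3), round(shifted.var(ddof=1), 6))
    # The mean column changes with c; the variance column does not, so an
    # invariant data reduction can depend on the sample only through such
    # location-free statistics as the sample variance.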

Chapter IX. Ancillary Statistics, Pivotal Quantities and Confidence Statements

Abstract
The most commonly used expression in Statistics is information; yet, we have no agreement on the definition or usage of this concept. However, in the particular situation where the problem is to predict a future value of a random variable X with a known probability distribution p(·), we all seem to agree that the information on the yet unobserved future value of X may be characterized by the function p(·) itself. And if we have another variable Y such that the conditional distribution p(·|Y) of X, given Y, is also known then, having observed Y, we can claim that the information on X has shifted from p(·) to p(·|Y). [To avoid a multiplicity of notations, we do not distinguish between a random variable X, an observed value of X and a typical point in the sample space of X.] If p(·|Y) is the same for all values of Y, then X is stochastically independent of Y. In this case Y is said to have no information on X. And we know how to prove then that X has no information on Y.
J. K. Ghosh

Survey Sampling and Randomization

Frontmatter

Chapter X. Sufficiency in Survey Sampling

Abstract
In the present context the term ‘statistical structure’ denotes a triplet (X, A, P) where
(i) X = {x} is the sample space — the set of all the ‘possible outcomes’ of the statistical (sampling) experiment,
(ii) A = {A} is a Borel field of subsets of X — the class of ‘events’ that are recognized as ‘measurable’, and
(iii) P = {Pθ} is a family of probability measures (on A) indexed by the ‘parameter’ θ.
J. K. Ghosh

Chapter XI. Likelihood Principle and Survey Sampling

Abstract
This article was written with the object of emphasizing the following four points.
J. K. Ghosh

Chapter XII. On the Logical Foundations of Survey Sampling

Abstract
It is a mathematical necessity that we idealize the real state of affairs and come up with a set of concepts that are simple enough to be incorporated in a mathematical theory.
J. K. Ghosh

Chapter XIII. On the Logical Foundations of Survey Sampling Discussions

Without Abstract
J. K. Ghosh

Chapter XIV. Relevance of Randomization in Data Analysis

Abstract
This essay is a natural sequel to an earlier one (Basu, 1971) presented at a symposium held in Waterloo, Ontario in March/April 1970. The writing of this essay was promised and its content foreshadowed in the Waterloo essay. While presenting that essay, I made a number of off-the-cuff remarks challenging the logic of the so-called randomization analysis of data. Here I propose to give a further account of my views on the question of data analysis. The time lag of over seven years between the two essays is only a measure of my diffidence on the important question of the relevance of randomization at the data analysis stage.
J. K. Ghosh

Chapter XV. The Fisher Randomization Test

Abstract
Randomization is widely recognized as a basic principle of statistical experimentation. Yet we find no satisfactory answer to the question, Why randomize? In a previous paper (Basu 1978b) the question was examined from the point of view of survey statistics. In this article we take an uninhibited frontal view of a part of the randomization methodology generally known as the Fisher randomization test.
J. K. Ghosh
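For readers unfamiliar with the mechanics being criticized, here is a minimal sketch of a two-sample randomization (permutation) test in Python with hypothetical data; it follows the standard recipe rather than any specific example of Fisher's or Basu's.

    import itertools
    import numpy as np

    treatment = np.array([23.0, 19.5, 25.1, 21.4])  # hypothetical responses
    control = np.array([18.2, 20.0, 17.9, 19.1])
    observed = treatment.mean() - control.mean()

    # Re-randomize the group labels in every possible way and recompute the
    # difference of means under each re-labelling.
    pooled = np.concatenate([treatment, control])
    n_t = len(treatment)
    diffs = []
    for idx in itertools.combinations(range(len(pooled)), n_t):
        mask = np.zeros(len(pooled), dtype=bool)
        mask[list(idx)] = True
        diffs.append(pooled[mask].mean() - pooled[~mask].mean())

    p_value = np.mean(np.abs(diffs) >= abs(observed))  # two-sided randomization P-value
    print(f"observed difference {observed:.2f}, randomization P-value {p_value:.3f}")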

Chapter XVI. The Fisher Randomization Test: Discussions

Abstract
Chapter XVI was published in the Journal of the American Statistical Association in 1980 with discussions. The discussants were David V. Hinkley, Oscar Kempthorne, David A. Lane, D.V. Lindley, and Donald B. Rubin. These discussions and Basu’s rejoinder are put together in this Chapter.
J. K. Ghosh

Miscellaneous Notes and Discussions

Frontmatter

Chapter XVII. Likelihood and Partial Likelihood

Abstract
During the fifty years (1912–1962) that R.A. Fisher dominated the field of statistical research, he came out with many innovative ideas like likelihood, sufficiency, ancillarity, asymptotic efficiency, information and intrinsic accuracy, pivotal quantities and fiducial distribution, conditionality argument and recovery of ancillary information, analysis of variance and covariance, randomization analysis of experimental data, etc. Of these new concepts, likelihood is certainly the first and the foremost and perhaps the only one that is likely to endure the severe test of time. Even though likelihood plays a central role in current statistical theory, a great deal of controversy and confusion surround the usage of the notion. This article, a natural sequel to an earlier long essay (Basu, 1973) on the subject, is an elaboration of some aspects of the controversy.
J. K. Ghosh

Chapter XVIII. A Discussion on the Fisher Exact Test

Abstract
I begin with a note of dissent. The Fisher-Yates conditional test for the 2 × 2 categorical data is called “exact” not because the test is “based on the theories of R.A. Fisher” but because the computation of the attained level of significance (the P-value) requires no mathematical approximation beyond what is already involved in the choice of the statistical model. The normal test is inexact because the null distribution of the test statistic TN is N(0,1) only as an approximation.
J. K. Ghosh
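The point about exactness can be made concrete with a small sketch (assuming Python with SciPy and a hypothetical 2 × 2 table): the conditional P-value is a finite hypergeometric tail sum, so no approximation beyond the assumed model enters the calculation.

    from scipy.stats import fisher_exact, hypergeom

    table = [[8, 2],   # hypothetical counts: rows are treatment/control,
             [3, 7]]   # columns are success/failure
    a = table[0][0]
    row1 = sum(table[0])              # first row total (number of "draws")
    col1 = table[0][0] + table[1][0]  # first column total (successes overall)
    n = sum(map(sum, table))          # grand total

    # Exact one-sided P-value: hypergeometric probability of a table at
    # least as extreme as the observed one, conditioning on both margins.
    p_one_sided = hypergeom.sf(a - 1, n, col1, row1)
    print("exact conditional one-sided P-value:", p_one_sided)
    print("SciPy cross-check:", fisher_exact(table, alternative="greater")[1])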

Chapter XIX. A Discussion on Survey Theory

Abstract
I did not intend to participate in today’s discussion on fundamental questions. But, Mr. Chairman, you have put me on the spot with the request that I throw some light on some of the hotly debated issues. If I succeed only in further muddying the water of clear thinking, please remember that you asked for it.
J. K. Ghosh

Chapter XX. A Note on Unbiased Estimation

Summary
It is shown that even in very simple situations (like estimating the mean of a normal population) where a uniformly minimum variance unbiased estimator of the unknown population characteristic is known to exist, no best (even locally) unbiased estimator exists as soon as we alter slightly the definition of variance.
J. K. Ghosh

Chapter XXI. The Concept of Asymptotic Efficiency

Summary
Partly of an expository nature, this note brings out the fact that an estimator, though asymptotically much less efficient (in the classical sense) than another, may yet have much greater probability concentration (as defined in this article) than the latter.
J. K. Ghosh

Chapter XXII. Statistics Independent of a Complete Sufficient Statistic

Abstract
If {Pθ}, θ ∈ Ω, be a family of probability measures on an abstract sample space S and T be a sufficient statistic for θ, then for a statistic T1 to be stochastically independent of T it is necessary that the probability distribution of T1 be independent of θ. The condition is also sufficient if T be a boundedly complete sufficient statistic. Certain well-known results of distribution theory follow immediately from the above considerations. For instance, if x1, x2, ..., xn are independent N(μ, σ)’s, then the sample mean x̄ and the sample variance s2 are mutually independent and are jointly independent of any statistic f (real or vector valued) that is independent of change of scale and origin. It is also deduced that if x1, x2, ..., xn are independent random variables such that their joint distribution involves an unknown location parameter θ, then there can exist a linear boundedly complete sufficient statistic for θ only if the x’s are all normal. Similar characterizations for the Gamma distribution also are indicated.
J. K. Ghosh
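A quick simulation sketch (Python/NumPy, not part of the paper) illustrates the normal-sample result quoted above: over many replications the sample mean and sample variance are uncorrelated for normal data, while for a skewed exponential population they are visibly dependent. Zero correlation does not prove independence; the sketch is only illustrative.

    import numpy as np

    rng = np.random.default_rng(2)
    reps, n = 20_000, 10

    for name in ("normal", "exponential"):
        if name == "normal":
            samples = rng.normal(loc=1.0, scale=2.0, size=(reps, n))
        else:
            samples = rng.exponential(scale=1.0, size=(reps, n))
        means = samples.mean(axis=1)            # sample mean per replication
        variances = samples.var(axis=1, ddof=1)  # sample variance per replication
        corr = np.corrcoef(means, variances)[0, 1]
        print(name, round(corr, 3))  # ~0 for normal, clearly positive for exponential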

Chapter XXIII. Statistics Independent of a Sufficient Statistic

Abstract
Let X be a random variable (sample) taking values in an arbitrary sample space X with the associated σ-field of measurable sets A and the family of probability measures {Pθ}, θ ∈ Ω. By a statistic T = T(X) we mean a measurable characteristic of the sample X, i.e., T is an A-B measurable transformation of the measurable space (X, A) into some measurable space (J, B). The family of induced (by the mapping T) probability measures on J is denoted by {PθT⁻¹}, θ ∈ Ω.
J. K. Ghosh

Chapter XXIV. The Basu Theorems

Abstract
The theorems are related to the notions of sufficiency, ancillarity and conditional independence. Let X denote the sample and θ the parameter that completely specifies the sampling distribution Pθ of X. An event E is ancillary if Pθ(E) is θ-free, i.e., Pθ(E) = Pθ′(E) for all θ′ ∈ Θ, the parameter space. A statistic Y = Y(X) is ancillary if every Y-event (i.e., a measurable set defined in terms of Y) is ancillary (see ANCILLARY STATISTICS). A statistic T is sufficient if, for every event E, there exists a θ-free version of the conditional probability function Pθ(E|T) (see SUFFICIENCY). The event E is (conditionally) independent of T if, for each θ ∈ Θ, the conditional probability function Pθ(E|T) is Pθ-essentially equal to the constant Pθ(E). The statistic Y is independent of T if every Y-event is independent of T. (Independence is a symmetric relationship between two statistics.)
J. K. Ghosh

Backmatter
