
1997 | Book

A Modern Approach to Probability Theory

Authors: Bert Fristedt, Lawrence Gray

Publisher: Birkhäuser Boston

Book Series: Probability and Its Applications


About this book

Overview

This book is intended as a textbook in probability for graduate students in mathematics and related areas such as statistics, economics, physics, and operations research. Probability theory is a 'difficult' but productive marriage of mathematical abstraction and everyday intuition, and we have attempted to exhibit this fact. Thus we may appear at times to be obsessively careful in our presentation of the material, but our experience has shown that many students find themselves quite handicapped because they have never properly come to grips with the subtleties of the definitions and mathematical structures that form the foundation of the field. Also, students may find many of the examples and problems to be computationally challenging, but it is our belief that one of the fascinating aspects of probability theory is its ability to say something concrete about the world around us, and we have done our best to coax the student into doing explicit calculations, often in the context of apparently elementary models.

The practical applications of probability theory to various scientific fields are far-reaching, and a specialized treatment would be required to do justice to the interrelations between probability and any one of these areas. However, to give the reader a taste of the possibilities, we have included some examples, particularly from the field of statistics, such as order statistics, Dirichlet distributions, and minimum variance unbiased estimation.

Table of Contents

Frontmatter

Probability Spaces, Random Variables, and Expectations

Frontmatter
Chapter 1. Probability Spaces

In modern probability theory, a fundamental building block is the “probability space”, a concept that is to be precisely defined in the latter portion of this chapter. We begin the chapter informally by giving some concrete examples of probability spaces. In particular, we model the experiment of tossing a fair coin infinitely many times.
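The coin-tossing experiment described in the abstract lends itself to simulation. The following Python sketch is ours, not the book's (the function name and seed are illustrative assumptions); it tosses a fair coin many times and reports the observed frequency of heads:

```python
import random

def toss_fair_coin(n, seed=0):
    """Simulate n independent tosses of a fair coin;
    return the observed frequency of heads."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n))
    return heads / n

freq = toss_fair_coin(100_000)
```

For a large number of tosses the frequency lands close to 1/2, as the Law of Large Numbers (treated in Chapter 5) predicts.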

Chapter 2. Random Variables

This chapter treats certain functions having as their domain a probability space. Such functions, known as ‘random variables’, have the property that they transform one probability space into another. In applications, random variables often represent what is actually observed in an experiment. Thus, in specific examples, it may be more descriptive to call them by such names as ‘random numbers’, ‘random sequences of heads and tails of a coin’, and ‘random chords of a circle’.

Chapter 3. Distribution Functions

The main purpose of this chapter is to classify all probability measures on the measurable space (ℝ, ℬ), where ℬ denotes the Borel σ-field on ℝ. We will accomplish this task by establishing a one-to-one correspondence between such probability measures and a certain class of functions, known as ‘distribution functions’. Many important probability measures and their corresponding distribution functions will be identified, including the binomial, normal, Poisson, gamma, and beta families of probability measures.

Chapter 4. Expectations: Theory

The ‘expectation’ of an ℝ-valued random variable is a weighted average of the values taken by that random variable. It is a useful tool for the description and analysis of random variables and their distributions. Properties of expectations treated in this chapter include linearity and an important convergence theorem. The calculation of expectations is facilitated by establishing a connection with Riemann-Stieltjes integration.

Chapter 5. Expectations: Applications

Expectations are amazingly useful in the study of random variables and their distributions. Some of the reasons for this statement are contained in this chapter. In the first section, we introduce the ‘variance’ of an ℝ-valued random variable. Variance is used to obtain one version of the Law of Large Numbers, also known informally as the Law of Averages. The ‘covariance’ of two random variables is also presented. In Section 2, variance and covariance are defined for ℝd-valued random variables, and Section 3 concerns the expectations of various functions of ℝ-valued random variables. The chapter concludes with a discussion of ‘probability generating functions’, used in the study of distributions on ℤ̄+. Several useful inequalities, including those of Chebyshev, Cauchy-Schwarz, and Jensen, are scattered throughout.
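As a concrete illustration of one of the inequalities mentioned, here is a Python sketch of ours (not code from the book) that empirically checks Chebyshev's bound P[|X − μ| ≥ kσ] ≤ 1/k² for a uniform random variable:

```python
import random

def chebyshev_check(n=200_000, k=1.5, seed=1):
    """Empirically check Chebyshev's inequality
    P[|X - mu| >= k*sigma] <= 1/k**2 for X uniform on [0, 1],
    where mu = 1/2 and sigma**2 = 1/12."""
    rng = random.Random(seed)
    mu, sigma = 0.5, (1 / 12) ** 0.5
    tail = sum(abs(rng.random() - mu) >= k * sigma for _ in range(n)) / n
    return tail, 1 / k**2
```

For the uniform distribution the true tail probability is well below Chebyshev's bound, which is typical: the inequality trades sharpness for complete generality.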

Chapter 6. Calculating Probabilities and Measures

In this chapter, after looking at several ways in which an event A can be defined in terms of other events A1, A2, …, we will develop methods for calculating P(A) in terms of the quantities P(A1), P(A2), …. These methods include the Kochen-Stone and Borel-Cantelli Lemmas, the inclusion-exclusion formula, and some convergence theorems. Also, in the last section of this chapter, we will discuss measures other than probability measures.
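The inclusion-exclusion formula mentioned above can be sketched in Python for a finite uniform sample space (an illustrative example of ours; the book contains no code):

```python
from itertools import combinations

def prob_union(events, omega_size):
    """P(A1 ∪ ... ∪ An) by the inclusion-exclusion formula, with each
    event given as a set of outcomes in a uniform sample space of
    omega_size equally likely points."""
    total = 0.0
    for r in range(1, len(events) + 1):
        sign = (-1) ** (r + 1)  # alternate signs over intersection orders
        for combo in combinations(events, r):
            inter = set.intersection(*map(set, combo))
            total += sign * len(inter) / omega_size
    return total

# One die roll: A1 = even outcome, A2 = outcome greater than 3.
# P(A1 ∪ A2) = 3/6 + 3/6 - 2/6 = 4/6
```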

Chapter 7. Measure Theory: Existence and Uniqueness

In some of the examples of previous chapters, most notably the coin-flip space, we defined a sample space Ω and a σ-field ℱ, but we did not completely specify probabilities P(A) for all A ∈ ℱ. Instead, we only gave the values of P(A) for events A in a smaller collection ℰ such that ℱ = σ(ℰ), and then we assumed without proof that P could be extended in a unique way to all of ℱ. In this chapter, we close this gap by showing that under certain natural assumptions, a function P defined on a collection ℰ of subsets of a sample space Ω can be extended in a unique way to a probability measure on ℱ = σ(ℰ). Once probability measures are constructed, we can piece them together to form σ-finite measures. Interesting examples include Lebesgue measure in ℝd and a certain measure on the space of all lines in ℝ2, both of which are invariant under rigid motions. In preparation for this chapter, the reader may want to review (Counter)example 6 in Chapter 1 and to reread the paragraphs leading up to that example and to Definition 5 of the same chapter.

Chapter 8. Integration Theory

In this chapter we have three main goals. The first is to extend the concept of expectation to general measure spaces. In this context, we do not use the phrase “expectation of a random variable”, but instead we introduce the ‘integral’ of a measurable function. This new kind of integral is called the ‘Lebesgue integral’. Our second goal is to introduce several tools, which along with the Monotone Convergence Theorem, are valuable for interchanging limit operations with expectation and integration. Our third goal is to explore some of the similarities and differences between Lebesgue integration and Riemann integration. As in the case of expectations, the Riemann-Stieltjes integral will be useful for computations of Lebesgue integrals. Also useful in such calculations is the ‘Radon-Nikodym derivative’, introduced near the end of the chapter.


Independence and Sums

Frontmatter
Chapter 9. Stochastic Independence

The first six sections of this chapter describe the measure-theoretic foundation for ‘stochastic independence’: products of probability spaces. After giving the basic definitions, we prove the existence of ‘product measure’ and also give an important result concerning integration with respect to product measure (the Fubini Theorem). Important relations among expectations, independence, and densities are described. The last three sections of the chapter do not depend on each other. The first treats the asymptotic behavior of sequences of independent identically distributed random variables. The second concerns ‘order statistics’ of finite sequences of such random variables. The last introduces some new distributions.

Chapter 10. Sums of Independent Random Variables

In Example 3 of Chapter 1 we calculated the probability that exactly k heads appear in n flips of a fair coin. In view of the construction of that example and the definition of independence given in the preceding chapter, we see that what we calculated is the distribution of the sum of n independent random variables, each of which has the Bernoulli distribution with parameter p = 1/2. Sums of independent random variables constitute a major theme in probability theory.
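The calculation recalled here, the distribution of a sum of independent Bernoulli(1/2) variables, can be reproduced by convolving probability mass functions. This Python sketch is ours, not the book's:

```python
def convolve(p, q):
    """pmf of X + Y for independent nonnegative-integer-valued X and Y,
    given their pmfs as lists indexed by value."""
    r = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r

bernoulli = [0.5, 0.5]  # P[X = 0], P[X = 1] for one fair-coin toss
pmf = [1.0]             # distribution of the empty sum
for _ in range(4):
    pmf = convolve(pmf, bernoulli)
# pmf is now the Binomial(4, 1/2) pmf: [1, 4, 6, 4, 1] scaled by 1/16
```

Repeated convolution of the Bernoulli pmf produces exactly the binomial probabilities of Example 3 of Chapter 1.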

Chapter 11. Random Walk

In this chapter, we will study certain sequences of random variables, known as ‘random walks’. These are defined in terms of sums of independent identically distributed random variables. Important in the study of random walks (and of more general random sequences) are ‘filtrations’ and ‘stopping times’. A filtration is a sequence of σ-fields representing the information available at various stages of an experiment. A stopping time is a ℤ̄+-valued random variable whose value may be regarded as the time at which an experiment is to be terminated. In applications, such as gambling theory, important stopping times are the time at which a random walk reaches a certain goal and the time at which it returns to its original position. These will be treated in the latter part of the chapter for several special random walks.
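A stopping time such as "the first time the walk reaches a goal" is easy to simulate. The following Python sketch is ours (the function name, step cap, and seed are illustrative assumptions, not material from the book):

```python
import random

def hitting_time(goal, max_steps=10_000, seed=0):
    """First time a simple symmetric random walk started at 0 reaches
    `goal` (a stopping time); returns None if the goal is not reached
    within max_steps."""
    rng = random.Random(seed)
    position = 0
    for t in range(1, max_steps + 1):
        position += rng.choice((-1, 1))
        if position == goal:
            return t
    return None
```

Note that whether the walk has stopped by time t depends only on the first t steps, which is the defining feature of a stopping time.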

Chapter 12. Theorems of A.S. Convergence

In this chapter we study the convergence of certain sequences defined in terms of sums of independent random variables. We will chiefly be interested in almost sure convergence, but it will be useful to also consider another weaker type of convergence, called ‘convergence in probability’. The two major results concerning almost sure convergence are the Strong Law of Large Numbers and the Kolmogorov Three-Series Theorem. Other results included in this chapter are three important tools: the Kolmogorov 0–1 Law, the Hewitt-Savage 0–1 Law, and an important inequality, known as the Etemadi Lemma. As an application of the ideas contained in the proof of the Strong Law of Large Numbers, we also determine the asymptotic behavior of the size of the image of a random walk.

Chapter 13. Characteristic Functions

‘Characteristic functions’ and ‘moment generating functions’ correspond to distributions on ℝ and ℝ̄+, respectively, in a manner analogous to the correspondence between probability generating functions and distributions on ℤ̄+. It will be seen here and in succeeding chapters that these tools are quite powerful, particularly when independence is involved. After completing our coverage of the theory for the real line, we generalize characteristic functions to ℝd and discuss normal distributions in that setting. At the end of the chapter, we apply the 1-dimensional theory to random walk on ℤ.


Convergence in Distribution

Frontmatter
Chapter 14. Convergence in Distribution on the Real Line

In this chapter, we introduce a concept of convergence for sequences of distributions on ℝ and ℝ̄. This ‘convergence in distribution’ gives us a rigorous way to express the idea that two distributions are close to each other. For instance, we show in an example that a Poisson distribution can be approximated arbitrarily closely by binomial distributions. An important result, the Continuity Theorem, gives a criterion for the convergence of a sequence of distributions in terms of the corresponding sequence of characteristic functions. There are several other useful criteria as well, which are collected together in a result known as the Portmanteau Theorem. Although the most important applications of convergence in distribution will be found in later chapters, some appear here: an introduction to the theory of ‘extreme values’, a discussion of the effects that ‘scaling’ and ‘centering’ have on sequences of distributions, and characterizations of moment generating functions and characteristic functions.

Chapter 15. Distributional Limit Theorems for Partial Sums

In this chapter we study convergence in distribution in settings involving sequences (Sn : n = 1, 2, …), where for each n, Sn = X1 + … + Xn is the nth partial sum of a series of independent random variables. Our first result is that convergence in distribution of (Sn) is equivalent to a.s. convergence. Thereafter, we specialize to the case in which (X1, X2, …) is an iid sequence. Further limit theorems involving more general sums of independent random variables will be found in Chapter 16.

Chapter 16. Infinitely Divisible Distributions as Limits

Recall that an infinitely divisible distribution is one that, for each n, is equal to the n-fold convolution power Qn*n of some distribution Qn. In preceding chapters several infinitely divisible distributions have appeared. In particular, all the stable distributions are infinitely divisible, as is easily seen by comparing the definitions of these two concepts. In this chapter we characterize all infinitely divisible characteristic functions. This characterization is based on the family of ‘compound Poisson distributions’, to be introduced in the first section.

Chapter 17. Stable Distributions as Limits

In Chapter 15, it was seen that if X1, X2, … is an iid sequence of ℝ-valued random variables with finite mean and variance, then as a consequence of the Classical Central Limit Theorem, quantities like

(17.1) P[x ⩽ X1 + … + Xn ⩽ y]

can be estimated using the normal distribution function. It turns out that for many iid sequences without finite variances, or even without finite means, such a quantity can still be estimated using stable distribution functions. For instance, useful information can be obtained about the distribution of the nth return to 0 of a simple symmetric random walk on ℤ (see Problem 20).
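For the finite-variance case recalled here, the normal estimate of (17.1) is easy to compute. The following Python sketch is ours (none of this code appears in the book; the continuity correction in the example is a standard refinement, not part of formula (17.1)):

```python
import math

def normal_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def clt_estimate(n, x, y, mu, sigma):
    """Normal approximation to P[x <= X1 + ... + Xn <= y] for iid
    summands with mean mu and standard deviation sigma."""
    s = sigma * math.sqrt(n)
    return normal_cdf((y - n * mu) / s) - normal_cdf((x - n * mu) / s)

# 100 fair-coin tosses (mu = 1/2, sigma = 1/2): estimate
# P[45 <= S100 <= 55] with a continuity correction.
approx = clt_estimate(100, 44.5, 55.5, 0.5, 0.5)
```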

Chapter 18. Convergence in Distribution on Polish Spaces

We want to extend the concept of convergence in distribution to probability spaces other than (ℝ, ℬ). Certain metric spaces, known as ‘Polish spaces’, play a central role. Particularly important examples of Polish spaces are the real line, the extended real line, d-dimensional Euclidean space, infinite products of intervals, and spaces of continuous functions. Thus, this chapter may be viewed as a mechanism for extending the concepts and results discussed in Chapter 14 to a wide variety of settings. (Basic facts about metric spaces are treated briefly in Appendix B. Some of the topology in Appendix C is also relevant.)

Chapter 19. The Invariance Principle and Brownian Motion

In this chapter, we bring together several of the key ideas of previous chapters to construct one of the most important objects in all of probability theory, namely ‘Brownian motion’. In order to apply the theory developed in Chapter 18, we will first take the point of view that Brownian motion is a random variable that takes values in the Polish space C[0,1] and later switch to the Polish space C[0, ∞). We will find that when a random walk on ℝ with finite variance is converted to a C[0, 1]-valued random variable in a natural way, then it can be centered and scaled so that its distribution approximates that of Brownian motion. As a consequence, much can be learned about the asymptotic properties of random walks by looking at the properties of Brownian motion. We rely frequently on the material concerning Polish spaces in Chapter 18. The Classical Central Limit Theorem (Chapter 15) and the Arzelà-Ascoli Theorem (Theorem 5 of Appendix B) also play important roles.
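The conversion of a random walk into a path on [0, 1] can be sketched numerically. This Python snippet is ours, not the book's; the rescaling (time by n, space by √n) matches the centering-and-scaling idea described in the abstract:

```python
import math
import random

def scaled_walk(n, seed=0):
    """Turn n steps of a simple +/-1 random walk into a path on [0, 1]:
    time is rescaled by n and space by sqrt(n), so that path[k]
    approximates Brownian motion at time k / n."""
    rng = random.Random(seed)
    path, s = [0.0], 0
    for _ in range(n):
        s += rng.choice((-1, 1))
        path.append(s / math.sqrt(n))
    return path
```

(Linear interpolation between the points path[k] would give the C[0, 1]-valued random variable; for large n the interpolated path's distribution approximates that of Brownian motion.)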


Conditioning

Frontmatter
Chapter 20. Spaces of Random Variables

This chapter is chiefly concerned with two metric spaces consisting of collections of random variables on a probability space (Ω, ℱ, P): L1(Ω, ℱ, P), consisting of all random variables X: Ω → ℝ such that E(|X|) < ∞, and L2(Ω, ℱ, P), consisting of those X for which E(X2) < ∞. The space L2(Ω, ℱ, P) has additional structure which makes it a ‘Hilbert space’. General Hilbert spaces are introduced in the first section, and L2(Ω, ℱ, P) is treated in the second section. Basic results from these two sections will play an important role in the definition of conditional probability distributions in Chapter 21. The metric space L1(Ω, ℱ, P) is discussed briefly in the third section, and the final section of the chapter treats an application of Hilbert space methods to an estimation problem.

Chapter 21. Conditional Probabilities

In this chapter we introduce two closely related concepts: conditional probabilities and conditional probability distributions. These two mathematical objects are vital in the construction and analysis of nearly all of the random sequences and stochastic processes that form the chief subject matter of the latter part of this book (Chapters 24 through 33).

Chapter 22. Construction of Random Sequences

There are situations in which it is natural to construct probability spaces by first specifying certain conditional distributions and then showing that there is a unique underlying (unconditional) distribution consistent with those specifications. The tool for doing this is Theorem 3. Several specific examples are included along with three general classes of examples: exchangeable sequences, Markov sequences, and Polya urns.

Chapter 23. Conditional Expectations

Integration with respect to conditional distributions gives conditional expectations. A precise definition is given in the first section of this chapter, after which several equivalent formulations are given. An interesting sidelight is the proof, at the end of the first section, of the Radon-Nikodym Theorem. The remaining sections are devoted to various formulas and properties, some of which are analogous to properties obtained in Chapters 4, 5, and 8 for (unconditional) expectations. Conditional variances are also treated and a useful formula relating conditional and unconditional variances is proved.


Random Sequences

Frontmatter
Chapter 24. Martingales

We will treat what many would regard as the most important type of random sequence, for it is both intrinsically natural and also a tool for treating other topics in probability. Martingales are particularly important in the study of Markov sequences and Markov processes, as will be seen later in this book.

Chapter 25. Renewal Sequences

The main subject of this chapter is a class of random sequences that are defined in terms of random walks T = (Tm : m = 0, 1, 2, …) in ℤ̄+ satisfying Tm+1(ω) ≥ 1 + Tm(ω) (with the understanding that ∞ ≥ 1 + ∞).

Chapter 26. Time-homogeneous Markov Sequences

Markov sequences are so important that several chapters could be devoted to their study. We will mainly limit our coverage to two kinds of topics: (i) those that involve instructive applications of material presented earlier in this book and (ii) those that lay the groundwork for material on continuous-time Markov processes presented later in the book.

Chapter 27. Exchangeable Sequences

Recall from Definition 4 of Chapter 22 that a finite or infinite sequence is exchangeable if its distribution is invariant under permutations of its terms. Our main goal in this chapter is to develop some simple ways to describe all exchangeable sequences. We will see that an infinite sequence is exchangeable if and only if it is conditionally iid, and that finite exchangeable sequences can be described in terms of a certain urn model. The characterization results, known as De Finetti Theorems, were foreshadowed in Example 2 and Problem 9 of Chapter 22. Problem 10 of Chapter 22 provides an example of a finite exchangeable sequence that is not conditionally iid, thereby showing that the finite case cannot be derived from the infinite case. We focus on exchangeable sequences whose terms take values in a finite set, and describe how some of the formulas obtained for such sequences can be used to obtain information about unknown parameters. Later we generalize to sequences whose terms take values in a Borel space. In the last section, we introduce some important distributions that are naturally connected with infinite exchangeable sequences and an urn model.

Chapter 28. Stationary Sequences

Many of the random sequences studied so far in this book are related in some significant way to ‘stationary sequences’. Informally speaking, a random sequence is stationary if it models the successive states of some system that is in equilibrium. Thus, an important example of a stationary sequence is a Markov sequence whose initial distribution is an equilibrium distribution. Exchangeable sequences are also stationary. And there are many other important examples, some of which will be introduced in this chapter.


Stochastic Processes

Frontmatter
Chapter 29. Point Processes

Loosely speaking, a point process is a random ‘discrete’ set of points in some Polish space. Thus, one could use a point process to model experiments like throwing grains of sand onto the floor and noting their locations, or pointing an astronomical telescope in a random direction and noting the positions of the stars seen in the field of view. A mathematical example would be the random set of values taken by a finite sequence of random variables. This latter example makes it clear that we may want to generalize the notion of sets to allow a given point to appear more than once. It turns out that there is a nice mathematical way to accommodate the generalization using a certain class of ℤ̄+-valued measures. The relevant definitions and basic facts are given in the first section. The most important point processes are ‘Poisson point processes’, which are characterized by the property that their intersections with disjoint subsets of the underlying Polish space are independent. These are treated in Sections 3 and 4. An important tool for studying the distributions of point processes is introduced in the fourth section. This tool is needed in the final two sections of the chapter, where various operations on point processes are studied. In particular, the convergence in distribution of point processes is considered in the final section. One nice result from that section is that Poisson point processes arise as limits of certain naturally defined sequences.

Chapter 30. Lévy Processes

In this chapter we treat continuous-time analogues of random walks, while restricting ourselves to the state spaces ℝ̄+ and ℝ. In the ℝ̄+-setting, these ‘Lévy processes’ can be constructed using Poisson point processes. Matters are slightly more complicated in the ℝ-setting; in addition to Poisson point processes, Brownian motion and a limiting procedure are needed. The main results of this chapter show that there is a Lévy process for each infinitely divisible distribution. After these results are proved, we extend to the continuous-time setting some of the theory that was developed earlier for random walks. The chapter concludes with a section in which a few ‘sample function properties’ of Lévy processes are discussed.

Chapter 31. Introduction to Markov Processes

In many ways, continuous time is a more natural setting than discrete time for the study of random time evolutions. Fortunately, there is a close connection between the two settings, particularly in the case of Markovian time evolutions. In this chapter, we build on much of what was done for Markov sequences in Chapter 26 in order to develop the basic theory of Markov processes. Our main goal in doing so is to prepare the way for the final two chapters in which two of the most important classes of Markov processes are studied.

Chapter 32. Interacting Particle Systems

An ‘interacting particle system’ can be informally described as a Markov process consisting of countably many pure-jump processes that interact by modifying each other’s transition rates. Each individual pure-jump process in such a system is located at a ‘site’ and has state space {0,1,2,...,n}. The state of the pure-jump process at a given site is the number of ‘particles’ at that site, with n being the maximum particle number.

Chapter 33. Diffusions and Stochastic Calculus

A diffusion is a time-homogeneous continuous-in-time strong Markov process. Most often, the state space is ℝd, although other spaces are also considered, especially in current research.

Backmatter
Metadata
Title
A Modern Approach to Probability Theory
Authors
Bert Fristedt
Lawrence Gray
Copyright Year
1997
Publisher
Birkhäuser Boston
Electronic ISBN
978-1-4899-2837-5
Print ISBN
978-1-4899-2839-9
DOI
https://doi.org/10.1007/978-1-4899-2837-5