
2008 | Book

Statistical Design


About this book

Statistical design is one of the fundamentals of our subject, being at the core of the growth of statistics during the previous century. Design played a key role in agricultural statistics and set down principles of good practice, principles that still apply today. Statistical design is all about understanding where the variance comes from, and making sure that is where the replication is. Indeed, it is probably correct to say that these principles are even more important today.

Table of Contents

Frontmatter
1. Basics
This is a book about design, and is not primarily concerned with analysis. Most designs, unless they are complete disasters, will result in a reasonably straightforward analysis. However, since the purpose of a good design is to lead to an efficient analysis, it is important to be familiar with the types of analysis that will be done. Thus, we will spend some time discussing the important parts of analyses, and how the design can impact them. We will also carry out many analyses and address what to do when the design does not go as planned.
Throughout the book analyses will typically be presented in an anova framework, complete with anova tables, sums of squares and degrees of freedom. This is done not because the anova is the best way to analyze data, but rather because the anova is the best way to think about data and plan designs. Fisher (1934) first called the anova “a convenient method of arranging the arithmetic”, but then explained that it is quite a bit more than that, as rigorously demonstrated by Speed (1987). The ideas of partitioning variation, counting degrees of freedom correctly, and identifying the correct error terms, are fundamental to any data analysis. Focusing on the anova helps us focus on these ideas, and ultimately helps us plan a better design.
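The "arranging the arithmetic" view can be sketched numerically. The following is a minimal illustration with made-up data for a oneway layout: partition the total sum of squares, count degrees of freedom, and test treatments against the error term.

```python
# A toy oneway anova, computed by hand (data invented for illustration).
from statistics import mean

groups = [
    [4.1, 5.2, 4.8, 5.0],   # treatment 1
    [6.3, 5.9, 6.8, 6.1],   # treatment 2
    [5.0, 4.4, 4.9, 5.3],   # treatment 3
]

all_obs = [y for g in groups for y in g]
grand = mean(all_obs)

# Partition the variation: SS(Total) = SS(Treatment) + SS(Within).
ss_trt = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
ss_total = sum((y - grand) ** 2 for y in all_obs)

# Count degrees of freedom: t - 1 for treatments, N - t for within error.
df_trt = len(groups) - 1                  # = 2
df_within = len(all_obs) - len(groups)    # = 9

# Test treatments against the within (error) mean square.
F = (ss_trt / df_trt) / (ss_within / df_within)
```

The point of writing it out this way is exactly the one made above: the anova table forces us to say where every degree of freedom goes and which mean square serves as the error term.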
This first chapter is a collection of “basics”, topics which should seem a bit familiar, but the explanations and interpretations may be somewhat different from what was previously seen. However, since we are assuming some familiarity with these topics, the review will be brief and a little disjointed.
2. Completely Randomized Designs
A theoretical consequence of the fact that all factors in a CRD must be fixed factors is that we can study the theory of all CRDs simply by looking at the oneway CRD. This follows from the simple error structure of the CRD and the fact that any effect can be built up through contrasts. However, this fact is useful only as a theoretical tool, say when we are trying to develop distributional properties, as the data layout and the treatment structure always play an important practical role. But the oneway CRD is the place to start.
Most importantly, the randomization structure of the CRD implies that there is only one error term, the within error, and all effects are tested against it.
3. Complete Block Designs
Just as a oneway anova is a generalization of a two-sample t-test, a randomized complete block (RCB) design is a generalization of a paired t-test. In this first section we review some basics and do a small example, and show how to build up an RCB from pairwise t-tests.
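The stated equivalence can be checked directly: with only two treatments, the RCB anova F statistic is exactly the square of the paired t statistic. A small numerical sketch (data invented for illustration):

```python
# Two treatments measured in each of 5 blocks (made-up data).
import math
from statistics import mean, stdev

trt_a = [12.1, 10.8, 13.5, 11.2, 12.9]   # treatment A, one value per block
trt_b = [11.4, 10.1, 12.6, 10.7, 12.0]   # treatment B, same blocks

n = len(trt_a)
d = [x - y for x, y in zip(trt_a, trt_b)]

# Paired t statistic on the block-wise differences.
t = mean(d) / (stdev(d) / math.sqrt(n))

# RCB anova on the same data: blocks + 2 treatments.
grand = (sum(trt_a) + sum(trt_b)) / (2 * n)
ss_trt = n * ((mean(trt_a) - grand) ** 2 + (mean(trt_b) - grand) ** 2)
block_means = [(x + y) / 2 for x, y in zip(trt_a, trt_b)]
ss_block = 2 * sum((m - grand) ** 2 for m in block_means)
ss_total = sum((y - grand) ** 2 for y in trt_a + trt_b)
ss_error = ss_total - ss_trt - ss_block   # df = (n - 1)(2 - 1) = n - 1

F = (ss_trt / 1) / (ss_error / (n - 1))

# With two treatments, F equals t squared.
```

Blocking removes the block sum of squares from the error line, which is precisely what differencing does in the paired t-test.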
In this book we discuss two types of block effects, fixed and random. In most textbooks blocks are treated as a random effect without much discussion of options, but there are clear instances where blocks are not random (see Example 4.1). However, in such cases these factors are still blocks because of the randomization pattern they induce and, in particular, the covariance structure they induce. We focus on this, and look very carefully at how to model the covariance, which we find is the overwhelmingly important concern. Whether the block is fixed or random is a function of the particular experiment; as long as the covariance is correctly accounted for, valid inferences can be drawn.
In this chapter we will mainly concentrate on the classical approach with the blocks considered as random, leaving details of fixed blocks models and implications to Chapter 4.
4. Interlude: Assessing the Effects of Blocking
In Chapter 3 we modeled blocks as a random factor, one in which the levels that actually appear in the experiment are considered a random sample from all levels. However, the concept of "random factor" can sometimes be puzzling, as most of the time we do not actually take a random sample of blocks. Rather, we choose blocks to represent a wide variety of situations. In a sense the concept of a random factor is a fallacy (see Section 3.8.4). The important implication, rather, is that blocking induces a correlation in the design. This makes sense, as experimental units within a particular block should behave similarly, and hence will be correlated. This correlation can be modeled directly, or it can arise as a byproduct of assuming that blocks are a random effect. In either case we end up with similar analyses.
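The "correlation as a byproduct of a random block effect" claim can be seen in a short simulation. Variances here are chosen arbitrarily for illustration: if two units share a block effect with variance sigma_b^2 on top of unit error with variance sigma_e^2, their correlation is sigma_b^2 / (sigma_b^2 + sigma_e^2).

```python
# Simulation sketch (arbitrary variances): a shared random block effect
# induces a positive correlation between units in the same block.
import math
import random

random.seed(0)
sigma_b, sigma_e = 2.0, 1.0        # block and within-unit std. deviations
n_blocks = 100_000

x, y = [], []
for _ in range(n_blocks):
    u = random.gauss(0, sigma_b)   # one block effect shared by both units
    x.append(u + random.gauss(0, sigma_e))
    y.append(u + random.gauss(0, sigma_e))

# Empirical correlation between the two units of each block.
mx, my = sum(x) / n_blocks, sum(y) / n_blocks
cov = sum((p - mx) * (q - my) for p, q in zip(x, y)) / n_blocks
var_x = sum((p - mx) ** 2 for p in x) / n_blocks
var_y = sum((q - my) ** 2 for q in y) / n_blocks
rho_hat = cov / math.sqrt(var_x * var_y)

# Theory: Cov = sigma_b^2 and Var = sigma_b^2 + sigma_e^2, so the
# intraclass correlation here is 4 / (4 + 1) = 0.8.
rho = sigma_b ** 2 / (sigma_b ** 2 + sigma_e ** 2)
```

Whether this compound-symmetry covariance is written down directly or derived from the random-effect assumption, the resulting calculations are the same, which is the point of the chapter.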
Whether blocks are random or fixed, the important point is that a correlation structure is induced. As far as the model calculations go (variances, covariances, etc.), the two cases are quite similar.
5. Split Plot Designs
Split plot experiments are the workhorse of statistical design. There is a saying that if the only tool you own is a hammer, then everything in the world looks like a nail. It might be fair to say that, from now on, almost every design that you see will be some sort of split plot.
A split plot design (or split unit design) is one in which there is more than one type of experimental unit. Although split unit is probably the more accurate term, this design also grew out of agriculture, and the historical term seems to be the more popular one.
6. Confounding in Blocks
Thus far, all of the designs we have looked at have been complete in that every treatment has appeared in every block. This is the best situation and gives us the best information for treatment comparisons. However, there are many situations where we cannot put every treatment in every block (often due to time, money, or physical constraints of the experiment). For example, a microarray experiment using a two-dye chip is restricted to two treatments per block (microarray). In these cases the design becomes incomplete in that not every treatment is in every block.
If the design is incomplete, we are immediately faced with the fact that treatment comparisons are confounded with block effects, which, of course, causes problems. There is the obvious problem that block differences may affect treatment comparisons, and also the problem that block variances can creep into the variance of a treatment comparison. The point of this chapter is to see how to deal with incomplete designs so that we can mitigate these problems.
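A toy arithmetic sketch of the confounding problem (layout and numbers invented for illustration): with two treatments per block, as on a two-dye chip, two treatments that never share a block can only be compared across blocks, and a naive comparison then absorbs the block difference.

```python
# Block 1 holds treatments A and B; block 2 holds A and C, so B and C
# never appear together. Build data with NO treatment effect at all,
# only a block difference of 5 units:
block_effect = {1: 0.0, 2: 5.0}
obs = {
    ("A", 1): 10.0 + block_effect[1],
    ("B", 1): 10.0 + block_effect[1],
    ("A", 2): 10.0 + block_effect[2],
    ("C", 2): 10.0 + block_effect[2],
}

# The naive comparison of B and C crosses blocks and picks up the block
# difference, even though the treatments are identical:
naive = obs[("B", 1)] - obs[("C", 2)]

# Within-block contrasts are free of block effects; because A appears in
# both blocks, their difference recovers B - C without the block shift:
adjusted = (obs[("B", 1)] - obs[("A", 1)]) - (obs[("C", 2)] - obs[("A", 2)])
```

The adjusted contrast works here only because treatment A links the two blocks; arranging such links systematically is what the incomplete designs of this chapter are about.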
Backmatter
Metadata
Title
Statistical Design
Author
George Casella
Copyright Year
2008
Publisher
Springer New York
Electronic ISBN
978-0-387-75965-4
Print ISBN
978-0-387-75964-7
DOI
https://doi.org/10.1007/978-0-387-75965-4
