“A circus owner is planning to ship his 50 adult elephants and so he needs a rough estimate of the total weight of the elephants”: so begins Example 3 in Basu (1971), the most colorful and striking illustration of Basu’s challenges to the design-based analysis of sample survey data. The full story is included in the box for easy reference. The owner decides to take a sample of size n=1 (“As weighing an elephant is a cumbersome process”). He is then talked out of a non-random sample (select Sambo, the elephant that had the average weight three years before) and the model-based estimate (50y, where y is the weight of the selected elephant) and into an unequal probability sample (select Sambo with probability 99/100 and any of the other elephants with probability 1/4900) and the Horvitz-Thompson estimator (100y/99 if Sambo is selected and 4900y if any other elephant is selected). The point of the story is summarised in Figure 1, which shows the log-sampling distributions (i.e. the sampling distributions of the log of the estimators) of the model-based estimator and the Horvitz-Thompson estimator for samples of size 1 from a troupe of 50 elephants. (We plot the log-sampling distributions to improve the visual impact.) On this scale, the model-based estimator is very close to the actual total weight (indicated by an arrow) but, and this is Basu’s elegantly made point, the design-unbiased Horvitz-Thompson estimator is far from the actual total weight in every possible sample. The design-based optimality of the Horvitz-Thompson estimator is no consolation to either the circus owner or the “unhappy statistician” who, Basu tells us, “lost his circus job (and perhaps became a teacher of statistics!)”.
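The arithmetic behind the example can be checked with a short sketch. The elephant weights below are purely hypothetical (neither Basu nor the text above gives any), chosen only so that Sambo is close to the average; the selection probabilities are those of the story. With a sample of size 1, the inclusion probability of each elephant equals its selection probability, so the Horvitz-Thompson estimate is the selected weight divided by that probability.

```python
from fractions import Fraction

N = 50  # size of the troupe

# HYPOTHETICAL weights in kg (an assumption for illustration only).
# Index 0 is Sambo, chosen to be close to the average weight.
weights = [Fraction(4050)] + [Fraction(3500 + 20 * i) for i in range(1, N)]
total = sum(weights)

# Selection probabilities from the story: 99/100 for Sambo,
# 1/4900 for each of the other 49 elephants.
probs = [Fraction(99, 100)] + [Fraction(1, 4900)] * (N - 1)


def horvitz_thompson(y, p):
    """Horvitz-Thompson estimate of the total from a sample of size 1."""
    return y / p


def model_based(y, n_units=N):
    """Model-based estimate: every elephant weighs about as much as the one weighed."""
    return n_units * y


# One Horvitz-Thompson estimate per possible sample.
ht_estimates = [horvitz_thompson(y, p) for y, p in zip(weights, probs)]

# Design expectation: average the estimates over the sampling design.
ht_expectation = sum(p * e for p, e in zip(probs, ht_estimates))

# Model-based estimate when Sambo is the elephant weighed.
sambo_estimate = model_based(weights[0])

print("actual total:            ", float(total))
print("HT if Sambo is selected: ", float(ht_estimates[0]))
print("smallest other HT value: ", float(min(ht_estimates[1:])))
print("HT design expectation:   ", float(ht_expectation))
print("model-based (Sambo):     ", float(sambo_estimate))
```

Exact fractions are used so that design-unbiasedness holds as an identity rather than up to rounding. Under these assumed weights the Horvitz-Thompson estimate is far too small when Sambo is selected and far too large otherwise, even though its expectation over the design equals the actual total.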