Abstract
Health-Related Quality of Life (HRQoL) measures are becoming more frequently used in clinical trials and health services research, as both primary and secondary endpoints. Investigators are now asking statisticians for advice on how to plan and analyse studies using HRQoL measures, which includes questions on sample size. Sample size requirements are critically dependent on the aims and objectives of the study, the proposed summary measure and effect size, and the method of calculating the test statistic.
We present a tutorial on methods of sample size calculation for HRQoL outcomes. We also briefly review the HRQoL literature to see what has been done in practice. The aim of this tutorial is to provide pragmatic guidance to researchers on the methods of calculating sample sizes when using HRQoL measures as outcomes.
HRQoL measures such as the SF-36, NHP and QLQ-C30 are usually measured on an ordered categorical (ordinal) scale. We argue that it is often incorrect to treat these scales as if they were continuous and normally distributed, and that the mean score may not be a good summary measure of HRQoL data. However, the ordinal scaling of HRQoL measures leads to problems in determining sample size. We suggest that the odds ratio (OR), rather than the mean difference, may be a more suitable summary measure for comparing groups, and that methods suitable for ordinal data should therefore be used for analysis.
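As an illustration of an OR-based calculation for an ordinal outcome, the following is a minimal sketch of the widely used formula attributed to Whitehead (1993) for a two-group comparison. It assumes equal allocation, a two-sided test, a normal approximation and proportional odds; the function names, the default significance level and power, and the example proportions are hypothetical choices made for this sketch and are not taken from the tutorial.

```python
from math import ceil, log
from statistics import NormalDist


def treatment_proportions(control_props, odds_ratio):
    """Category proportions in the treatment group implied by the control
    proportions and a common odds ratio (proportional odds assumption).

    The odds ratio is applied to the cumulative proportions, taken from the
    first category upwards.
    """
    cum_control = 0.0
    prev_cum_treat = 0.0
    treat = []
    for p in control_props[:-1]:
        cum_control += p
        cum_treat = odds_ratio * cum_control / (
            1 - cum_control + odds_ratio * cum_control
        )
        treat.append(cum_treat - prev_cum_treat)
        prev_cum_treat = cum_treat
    treat.append(1.0 - prev_cum_treat)  # the last category takes the remainder
    return treat


def ordinal_n_per_group(control_props, odds_ratio, alpha=0.05, power=0.80):
    """Approximate number of subjects per group for an ordinal outcome."""
    z = NormalDist().inv_cdf
    treat = treatment_proportions(control_props, odds_ratio)
    # Mean proportion in each category, averaged over the two groups.
    mean_props = [(c + t) / 2 for c, t in zip(control_props, treat)]
    denom = 1 - sum(p ** 3 for p in mean_props)
    n = 6 * (z(1 - alpha / 2) + z(power)) ** 2 / (log(odds_ratio) ** 2 * denom)
    return ceil(n)


if __name__ == "__main__":
    # Hypothetical four-category outcome, equally spread in the control group.
    print(ordinal_n_per_group([0.25, 0.25, 0.25, 0.25], odds_ratio=2.0))
```

Under these assumptions the required sample size falls as the anticipated OR moves away from 1, and rises as responses concentrate in fewer categories.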
The frequency distribution of HRQoL scores should be assessed to see whether parametric assumptions are satisfied and whether the sample mean is a good summary measure of the data. Given the non-normal distribution of the majority of HRQoL outcome measures, summary measures such as means and standard deviations are difficult to interpret. Thus standardised differences (effect sizes) and parametric methods may not be a suitable basis for calculation of sample size. Finally, we argue that any sample size calculation (with all its attendant assumptions) leads to better research than no sample size calculation at all.
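For reference, the parametric calculation based on a standardised difference, which the abstract cautions may not suit HRQoL data, can be sketched as follows. This is the standard normal-approximation formula for comparing two means with equal allocation and a two-sided test; the function name and defaults are hypothetical, and the result is only meaningful if the outcome is approximately normally distributed.

```python
from math import ceil
from statistics import NormalDist


def continuous_n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate subjects per group for a two-sample comparison of means.

    effect_size is the standardised difference: the anticipated difference
    in means divided by the common standard deviation.
    """
    z = NormalDist().inv_cdf
    return ceil(2 * (z(1 - alpha / 2) + z(power)) ** 2 / effect_size ** 2)


if __name__ == "__main__":
    # A standardised difference of 0.5, two-sided 5% significance, 80% power.
    print(continuous_n_per_group(0.5))  # prints 63
```

Exact calculations based on the t distribution give a slightly larger figure than this approximation.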
Cite this article
Walters, S.J., Campbell, M.J. & Paisley, S. Methods for Determining Sample Sizes for Studies Involving Health-Related Quality of Life Measures: A Tutorial. Health Services & Outcomes Research Methodology 2, 83–99 (2001). https://doi.org/10.1023/A:1020102612073
DOI: https://doi.org/10.1023/A:1020102612073