Elsevier

NeuroImage

Volume 32, Issue 1, 1 August 2006, Pages 180-194

Reliability of MRI-derived measurements of human cerebral cortical thickness: The effects of field strength, scanner upgrade and manufacturer

https://doi.org/10.1016/j.neuroimage.2006.02.051

Abstract

In vivo MRI-derived measurements of human cerebral cortex thickness are providing novel insights into normal and abnormal neuroanatomy, but little is known about their reliability. We investigated how the reliability of cortical thickness measurements is affected by MRI instrument-related factors, including scanner field strength, manufacturer, upgrade and pulse sequence. Several data processing factors were also studied. Two test–retest data sets were analyzed: 1) 15 healthy older subjects scanned four times at 2-week intervals on three scanners; 2) 5 subjects scanned before and after a major scanner upgrade. Within-scanner variability of global cortical thickness measurements was <0.03 mm, and the point-wise standard deviation of measurement error was approximately 0.12 mm. Variability was, on average, 0.15 mm and 0.17 mm for cross-scanner (Siemens/GE) and cross-field strength (1.5 T/3 T) comparisons, respectively. Scanner upgrade neither increased variability nor introduced bias. Measurements across field strength, however, were slightly biased (thicker at 3 T). The number of (single vs. multiple averaged) acquisitions had a negligible effect on reliability, but the use of a different pulse sequence had a larger impact, as did different parameters employed in data processing. Sample size estimates indicate that a regional cortical thickness difference of 0.2 mm between two different groups could be identified with as few as 7 subjects per group, and a difference of 0.1 mm could be detected with 26 subjects per group. These results demonstrate that MRI-derived cortical thickness measures are highly reliable when MRI instrument and data processing factors are controlled but that it is important to consider these factors in the design of multi-site or longitudinal studies, such as clinical drug trials.
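To make the idea of a sample size estimate concrete, the sketch below applies the standard normal-approximation formula for a two-sided comparison of two group means, n = 2((z_{1-α/2} + z_{power})σ/δ)². It is an illustration only: this excerpt does not give the authors' exact procedure, significance level, power or standard deviation, so the function name `subjects_per_group`, the defaults `alpha=0.05` and `power=0.9`, and the example value `sigma_mm=0.12` are all assumptions, and the output should not be expected to reproduce the figures quoted in the abstract.

```python
import math
from statistics import NormalDist


def subjects_per_group(delta_mm, sigma_mm, alpha=0.05, power=0.9):
    """Subjects per group needed to detect a mean difference of delta_mm.

    Normal approximation for a two-sided, two-sample comparison of means
    with equal group sizes and a common standard deviation sigma_mm.
    alpha and power are assumed defaults, not values taken from the paper.
    """
    if delta_mm <= 0 or sigma_mm <= 0:
        raise ValueError("delta_mm and sigma_mm must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    n = 2 * ((z_alpha + z_power) * sigma_mm / delta_mm) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # sigma_mm = 0.12 is a hypothetical placeholder chosen for illustration.
    for delta in (0.2, 0.1):
        print(delta, subjects_per_group(delta, sigma_mm=0.12))
```

Because the required n scales with (σ/δ)², halving the detectable difference roughly quadruples the number of subjects, which is the same qualitative pattern as the step from a 0.2 mm to a 0.1 mm difference reported above.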

Introduction

Techniques that enable the in vivo MRI-derived quantitative measurement of properties of the human cerebral cortex, such as thickness, are beginning to demonstrate important potential applications in basic and clinical neuroscience. Changes in the gray matter that makes up the cortical sheet are manifested in normal aging (Jack et al., 1997, Salat et al., 1999, Salat et al., 2004, Sowell et al., 2003, Sowell et al., 2004), Alzheimer's disease (Dickerson et al., 2001, Thompson et al., 2003, Lerch et al., 2005), Huntington's disease (Rosas et al., 2002), corticobasal degeneration (Boeve et al., 1999), amyotrophic lateral sclerosis (Kiernan and Hudson, 1994), multiple sclerosis (Sailer et al., in press) and schizophrenia (Thompson et al., 2001, Kuperberg et al., 2003, Narr et al., 2005). Progressive thinning of the cortex follows a disease-specific regional pattern in certain diseases, such as Alzheimer's disease (Thompson et al., 2003); thus, in vivo cortical thickness measures could be useful as a biomarker of the evolution of the disease. Longitudinal imaging-based biomarkers of disease progression will likely be of great utility in evaluating the efficacy of disease-modifying therapies (Dickerson and Sperling, 2005).

Measurement of cortical thickness from MRI data is a non-trivial task. Manual thickness measurements are difficult to obtain due to the highly convoluted nature of the cortex. It can take a trained anatomist several days to manually label a high-resolution set of MR brain images, and even this labor-intensive procedure allows only the measurement of cortical volume, not cortical thickness, because the cortical thickness is a property that can only be properly measured if the location and orientation of both the gray/white and pial surfaces are known. To facilitate automatic thickness measurement, many computerized methods have been proposed in the literature for segmenting the cortex and finding the cortical surfaces from MRI data (Dale et al., 1999, Joshi et al., 1999, MacDonald et al., 1999, Xu et al., 1999, Zeng et al., 1999, Van Essen et al., 2001, Shattuck and Leahy, 2002, Sowell et al., 2003, Barta et al., 2005, Han et al., 2005a).

Although the validation of MRI-derived cortical thickness measurements has been performed against regional manual measurements derived from both in vivo and post-mortem brain scans (Rosas et al., 2002, Kuperberg et al., 2003, Salat et al., 2004), the reliability of measures of this fundamental morphometric property of the brain has received relatively little systematic investigation (Fischl and Dale, 2000, Rosas et al., 2002, Kuperberg et al., 2003, Sowell et al., 2004, Lerch and Evans, 2005). Most of these studies approach reliability by comparing thickness measurements across different subjects or by performing repeated scans on a few subjects acquired within the same scan session or within very short scan intervals (for example, the subjects were removed from the scanner and then scanned again in 5 min (Sowell et al., 2004)). This approach may greatly underestimate the sources of variability within and between studies.

Variability in MRI-derived morphometric measures may result from subject-related factors, such as hydration status (Walters et al., 2001), instrument-related factors, such as field strength, scanner manufacturer or pulse sequence, or data-processing-related factors, including not only software package but also the parameters chosen for analysis. All of these factors may contribute to differences between typical cross-sectional studies (e.g., when interpreting differences between two studies of patients with Alzheimer's disease vs. controls scanned at a single time point on one scanner). Longitudinal studies of normal development, aging or disease progression face additional challenges associated with both subject-related and instrument-related factors (e.g., major scanner upgrades). For multi-site studies, it is critical to understand and adjust for instrument-related differences between sites, such as scanner manufacturer, field strength and other hardware components. Finally, longitudinal multi-center studies, such as the Alzheimer's Disease Neuroimaging Initiative, must contend with all of these factors while attempting to detect subtle effects. Thus, detailed quantitative data regarding the degree to which each of the factors outlined above contributes to variability in cortical thickness (and other measures) could be very helpful for both study design and interpretation. Unfortunately, little work in this area has been performed.

Specifically, it is not yet clear how cortical thickness measures vary as a function of MRI instrument-related factors, such as field strength, scanner manufacturer and scanner software and hardware upgrades. Knowledge of the degree to which different MRI instrument-related factors affect the reliability of cortical thickness measures is essential for the interpretation of these measures in basic and clinical neuroscientific studies. Furthermore, this knowledge is critical if cortical thickness measures are to find applications as biomarkers in clinical trials of putative treatments for neurodegenerative or other neuropsychiatric diseases.

We undertook this study to evaluate the reliability of a cortical thickness measurement method both within and across different scanner platforms and field strengths, with the goal of quantitatively identifying the factors that are the greatest contributors to cortical thickness variability. Two groups of test–retest data sets were acquired and analyzed. In the first data set, 15 healthy older subjects were scanned four times at 2-week intervals on three different scanner platforms (test scan on Siemens Sonata 1.5 T, re-test scan on Siemens Sonata 1.5 T, cross-site scan on GE Signa 1.5 T, cross-field-strength scan on Siemens Trio 3 T). Older participants were studied so that anatomical variability related to atrophy and age-related signal changes was represented. The 2-week interval was chosen so that elements of variability related to subject hydration status and minor instrument drift would be included, which may be artificially minimized when the test–retest interval is several minutes to ∼1 day. First, the test–retest reliability of cortical thickness measurements was investigated from the two Siemens Sonata sessions. Next, analyses were performed on the effects of various instrument-related factors, including: a) different MR manufacturer (Siemens vs. GE); b) different field strength (1.5 T vs. 3 T); c) different pulse sequences (MPRAGE versus multiple flip angle, multi-echo FLASH); d) different number of data acquisitions (one MPRAGE vs. two averaged MPRAGE acquisitions). Finally, effects of several data processing-related factors were analyzed, including: a) different levels of spatial smoothing of the raw thickness maps; and b) different processing schemes (cross-sectional versus longitudinal).

The second data set consisted of 5 healthy younger subjects scanned repeatedly before and after a major scanner upgrade, with the goal of evaluating the reliability of thickness measurements in longitudinal studies that contend with scanner upgrades.

In this study, thickness measurements were performed using the FreeSurfer software package, which is an automated method for cortical surface reconstruction and thickness computation. Although comparison with other thickness measurement methods is beyond the scope of this paper, the effects of several aspects of the processing system within FreeSurfer were studied as noted above.

Data acquisition

Two groups of test–retest data sets were acquired and analyzed to characterize the reliability of cortical thickness estimation.

The first group of test–retest data consists of MRI scans acquired from 15 healthy older subjects (age between 66 and 81 years; mean: 69.5 years; SD: 4.8 years. 8 males, 7 females). All participants provided informed consent in accordance with the Human Research Committee of Massachusetts General Hospital. Each subject underwent 4 scan sessions at 2-week intervals (two

Reliability of global cortical thickness measure

Global thickness measures for each subject are presented in Table 1, where the mean and standard deviation were computed over the whole cortical surface combining left and right hemispheres. Overall, the range of cortical thickness values is similar for each individual subject across different scan sessions. The reproducibility of global mean cortical thickness is further demonstrated in Fig. 1 for the four test–retest comparisons. As can be seen, the global mean thickness is an exceedingly

Conclusion

The purpose of this study is to evaluate the reliability (precision) of an automated thickness measurement method both within- and across-scanner platforms and field strength. We also evaluated the effects on thickness measurement reliability of different imaging acquisition protocols (including number of acquisitions and imaging sequences) and different data processing or post-processing (smoothing of thickness map) schemes. Finally, we investigated the impact of a scanner upgrade on thickness

Acknowledgments

This research was supported by the following grants: a) NCRR Morphometry Biomedical Informatics Research Network (U24 RR021382), b) NCRR P41-RR14075 and RO1-RR16594-01A1, c) Pfizer Inc., d) the NIA (K23-AG22509 and P01-AG04953) and e) the MIND Institute.

References (48)

  • D.W. Shattuck et al. BrainSuite: an automated cortical surface identification tool. Med. Image Anal. (2002)
  • R.G. Steen et al. More than meets the eye: significant regional heterogeneity in human cortical T1. Magn. Reson. Imaging (2000)
  • A. Van der Kouwe et al. On-line automatic slice positioning for brain MR imaging. NeuroImage (2005)
  • P. Barta et al. A stochastic model for studying the laminar structure of cortex from MRI. IEEE Trans. Med. Imag. (2005)
  • Benner, T., Wisco, J.J., van der Kouwe, A., Fischl, B., Vangel, M.G., Hochberg, F.H., Sorensen, A.G., in press....
  • B.F. Boeve et al. Pathologic heterogeneity in clinically diagnosed corticobasal degeneration. Neurology (1999)
  • V. Braitenberg et al. Anatomy of the Cortex (1991)
  • J. Cohen. Statistical Power Analysis for the Behavioral Sciences (1988)
  • R.O. Duda et al. Pattern Classification (2001)
  • B. Fischl et al. Measuring the thickness of the human cerebral cortex from magnetic resonance images. Proc. Natl. Acad. Sci. U. S. A. (2000)
  • B. Fischl et al. High-resolution intersubject averaging and a coordinate system for the cortical surface. Hum. Brain Mapp. (1999)
  • B. Fischl et al. Automated manifold surgery: constructing geometrically accurate and topologically correct models of the human cerebral cortex. IEEE Trans. Med. Imag. (2001)
  • B. Fischl et al. Sequence-independent segmentation of magnetic resonance images. NeuroImage (2004)
  • B. Fischl et al. Automatically parcellating the human cerebral cortex. Cereb. Cortex (2004)
1 These authors contributed equally to this work.