Elsevier

Medical Image Analysis

Volume 17, Issue 8, December 2013, Pages 1293-1303
Regression forests for efficient anatomy detection and localization in computed tomography scans

https://doi.org/10.1016/j.media.2013.01.001

Abstract

This paper proposes a new algorithm for the efficient, automatic detection and localization of multiple anatomical structures within three-dimensional computed tomography (CT) scans. Applications include selective retrieval of patients’ images from PACS systems, semantic visual navigation, and tracking of radiation dose over time.

The main contribution of this work is a new, continuous parametrization of the anatomy localization problem, which allows it to be addressed effectively by multi-class random regression forests. Regression forests are similar to the more popular classification forests, but trained to predict continuous, multi-variate outputs, where the training focuses on maximizing the confidence of output predictions. A single pass of our probabilistic algorithm enables the direct mapping from voxels to organ location and size.

Quantitative validation is performed on a database of 400 highly variable CT scans. We show that the proposed method is more accurate and robust than techniques based on efficient multi-atlas registration and template-based nearest-neighbor detection. Due to the simplicity of the regressor’s context-rich visual features and the algorithm’s parallelism, these results are achieved in typical run-times of only ∼4 s on a conventional single-core machine.

Highlights

  • Accurate and very fast anatomy localization in CT scans.
  • Simultaneous detection of 26 anatomical structures.
  • Uses a parallel regression forest algorithm.
  • Assessed on a large clinical database of CT images.
  • Compared with a number of state-of-the-art algorithms.

Introduction

This paper proposes a new, parallel algorithm for the efficient detection and localization of anatomical structures (‘organs’) in 3D computed tomography studies. Localizing anatomical structures is an important step for many subsequent image analysis tasks (possibly organ-specific) such as segmentation, registration and classification. It is also crucial for managing database systems and creating intelligent navigation and visualization tools. For instance, one application is the efficient retrieval of selected portions of patients’ scans from PACS databases. When a physician wishes to inspect a particular organ, the ability to determine its position and extent automatically means that it is not necessary to retrieve the entire scan (which could comprise hundreds of MB of data), only a smaller region of interest. Thus it is possible to achieve faster user interaction while making economical use of the limited bandwidth. The proposed organ localizer could potentially also be used for tracking the amount of radiation absorbed by each organ over time. However, in its current form, the approximate representation of organs would produce only indicative dose estimates.

The main contribution of this work is a new parametrization of the anatomy localization task as a multivariate, continuous parameter estimation problem. This is addressed effectively via tree-based, non-linear regression. Unlike the popular classification forests (often referred to simply as “random forests”), regression forests (Breiman et al., 1984) have not yet been used in medical image analysis. Our approach is fully probabilistic and, unlike previous techniques (e.g. Zhou et al., 2007; Fenchel et al., 2008), is trained to maximize the confidence of output predictions. As a by-product, our method produces salient anatomical landmarks, i.e. automatically selected “anchor” regions that help localize organs of interest with high confidence. Our algorithm can localize both macroscopic anatomical regions (e.g. abdomen, thorax, trunk, etc.) and smaller scale structures (e.g. heart, l. adrenal gland, femoral neck, etc.) using a single, efficient model, cf. (Feulner et al., 2009).

Motivated mostly by the semantic navigation use-case scenario, our focus in this paper is on both accuracy of prediction and speed of execution. Our goal is to achieve accurate anatomy localization in seconds on a conventional machine.

Regression approaches. Regression algorithms (Hardle, 1990) estimate functions which map input variables to continuous outputs. The regression paradigm fits the anatomy localization task well. Indeed, the goal here is to learn the non-linear mapping from voxels directly to organ position and size.

The first work to use regression for anatomy localization in images is Zhou et al. (2005). There, the authors need to define the non-linear mapping as an analytical function whose exact form is learned via regularized boosting. They also present a thorough overview of different regression techniques and discuss the superiority of boosted regression. In their later work (Zhou et al., 2007), their boosted regression technique was improved by incorporating high degree-of-freedom weak learners. The main difference between that approach and the one presented here is in the non-linear mapping. Defining a regression function analytically, as done in Zhou et al. (2005, 2007), has two major drawbacks: (1) the definition of the function requires critical modeling assumptions for the type of the weak learner and the regularization term, and (2) obtaining a confidence measure for the regression output is non-trivial. In contrast, our approach does not assume an analytical form for the mapping. This results in a simpler formulation with fewer modeling choices. In addition, the probabilistic nature of our method yields a natural way of associating confidence with the predicted output. In fact, the training phase of our algorithm directly maximizes the confidence of the predicted probability distribution.
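To make "maximizing the confidence of the prediction" concrete, the sketch below scores a candidate node split by how much it tightens the distribution of the continuous targets. The exact objective used by the authors is not given in this excerpt, so this is an assumption: the targets at each node are modelled as a Gaussian with diagonal covariance, and the score is the reduction in summed log-variance (the Gaussian differential entropy up to constants). The names `log_variance_sum` and `confidence_gain` are hypothetical.

```python
import math
from typing import Sequence

Target = Sequence[float]  # one continuous output vector per training voxel


def log_variance_sum(targets: Sequence[Target]) -> float:
    """Sum of per-dimension log-variances of the targets.

    Assumption: a diagonal Gaussian is fitted at the node, so this is its
    differential entropy up to additive and multiplicative constants.
    """
    n = len(targets)
    dims = len(targets[0])
    total = 0.0
    for d in range(dims):
        mean = sum(t[d] for t in targets) / n
        var = sum((t[d] - mean) ** 2 for t in targets) / n
        total += math.log(var + 1e-9)  # small constant avoids log(0)
    return total


def confidence_gain(parent: Sequence[Target],
                    left: Sequence[Target],
                    right: Sequence[Target]) -> float:
    """Entropy-style gain of a split; larger means tighter child predictions."""
    n = len(parent)
    return (log_variance_sum(parent)
            - len(left) / n * log_variance_sum(left)
            - len(right) / n * log_variance_sum(right))


# Toy 1-D targets forming two tight clusters.
parent = [(0.0,), (0.1,), (10.0,), (10.1,)]
good = confidence_gain(parent, parent[:2], parent[2:])  # separates the clusters
bad = confidence_gain(parent, [parent[0], parent[2]], [parent[1], parent[3]])
print(good > bad)
```

A split that separates the two clusters yields children with far smaller variance than the parent, so it scores much higher than a split that mixes them; a tree grown greedily on such a score favours leaves whose predictions are confident.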

A comparison between boosting, forests and cascades is found in Yin et al. (2007). To our knowledge, so far only two papers have used regression forests in imaging (Montillo and Ling, 2009; Gall and Lempitsky, 2009), neither with application to medical image analysis. For instance, Gall and Lempitsky (2009) address the problem of detecting pedestrians vs. background. For readers unfamiliar with regression forests, we provide a short explanation in the appendix. Also, a detailed description of general decision forests and their applications may be found in Criminisi and Shotton (2013), with free research code and demos available at http://research.microsoft.com/projects/decisionforests.

Classification-based approaches. In Zhan et al. (2008) organ detection is achieved via a confidence-maximizing sequential scheduling of multiple, organ-specific classifiers. In contrast, our single, tree-based regressor allows us to deal naturally with multiple anatomical structures simultaneously. As shown in the machine learning literature (Torralba et al., 2007), this encourages feature sharing and, in turn, better generalization. In Seifert et al. (2009) a sequence of probabilistic boosting tree (PBT) classifiers (first for salient slices, then for landmarks) is used. In contrast, our single regressor maps directly from voxels to organ poses; latent, salient landmark regions are extracted as a by-product. In Criminisi et al. (2009) the authors achieve localization of organ centers but fail to estimate the organ extent (similar to Gall and Lempitsky (2009)). Here we present a more direct, continuous model which estimates the position of the walls of the bounding box containing each organ, thus achieving simultaneous organ localization and extent estimation.
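The idea of predicting bounding-box walls rather than only a centre can be illustrated with a small sketch. It assumes an axis-aligned box stored as six wall coordinates and a regression target equal to the signed offset from a voxel to each wall; the wall ordering, the units and the names `Box`, `wall_offsets` and `box_from_offsets` are illustrative assumptions, not the paper's notation.

```python
from typing import NamedTuple, Sequence, Tuple


class Box(NamedTuple):
    """Axis-aligned organ bounding box given by its six wall positions."""
    x_lo: float
    x_hi: float
    y_lo: float
    y_hi: float
    z_lo: float
    z_hi: float


Voxel = Tuple[float, float, float]


def wall_offsets(v: Voxel, box: Box) -> Tuple[float, ...]:
    """Signed offset from voxel v to each of the six walls (the regression target)."""
    vx, vy, vz = v
    return (box.x_lo - vx, box.x_hi - vx,
            box.y_lo - vy, box.y_hi - vy,
            box.z_lo - vz, box.z_hi - vz)


def box_from_offsets(v: Voxel, offsets: Sequence[float]) -> Box:
    """Invert wall_offsets: recover absolute wall positions from a voxel's prediction."""
    vx, vy, vz = v
    anchors = (vx, vx, vy, vy, vz, vz)
    return Box(*(a + d for a, d in zip(anchors, offsets)))


def centre_and_size(box: Box):
    """Both location and extent follow directly from the six walls."""
    centre = ((box.x_lo + box.x_hi) / 2,
              (box.y_lo + box.y_hi) / 2,
              (box.z_lo + box.z_hi) / 2)
    size = (box.x_hi - box.x_lo, box.y_hi - box.y_lo, box.z_hi - box.z_lo)
    return centre, size


organ = Box(10.0, 30.0, 0.0, 40.0, 5.0, 15.0)  # made-up coordinates
voxel = (20.0, 20.0, 20.0)
offsets = wall_offsets(voxel, organ)
recovered = box_from_offsets(voxel, offsets)
centre, size = centre_and_size(recovered)
```

Because the six walls determine both the centre and the size of the box, a regressor that outputs these offsets gives location and extent in one step, which is the distinction drawn above with centre-only methods.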

Marginal Space Learning. One of the most popular approaches for object localization in medical images is Marginal Space Learning (MSL), proposed in Zheng et al. (2007, 2009a). MSL has been demonstrated to be very useful in practice (Zheng et al., 2009b; Barbu et al., 2012). However, that algorithm has three limitations. Firstly, MSL is designed to detect a single object at a time, and extending it to the joint localization of multiple objects (e.g. more than 20) is not immediate. For example, existing extensions rely on applying the algorithm iteratively, one run for each object of interest. The order of detection is either determined through combinatorial optimization or driven by the confidence values each object attains during the detection phase (Liu et al., 2010). In contrast, our method achieves joint localization of any number of structures without modification and without the need for complex ordering strategies.

Secondly, MSL builds upon multiple classification stages. For instance, to detect the position of the heart we may need: (1) a classifier trained to estimate overall translation, (2) a classifier trained on translation and rotation, and (3) yet another classifier trained on translation, rotation and scale. All three classifiers need to be applied in sequence for each organ. For e.g. 20 organs we would need to train 20 × 3 = 60 different classifiers, with clear scalability issues. In contrast, we propose using a single forest regressor (with e.g. only ∼4 trees) to deal with multiple organs (here tested on 26 anatomical structures).

Thirdly, we argue that solving a localization problem via classification is not optimal. In MSL, binary classifiers are run in a sliding-window fashion. For each point the classifier produces a positive answer (point is “close” to the structure) or a negative one (point is “far” from the structure). But reducing real-valued distances to binary decisions introduces a loss of information. Also, defining positive and negative examples is an ambiguous task. Instead, our regression forest directly estimates the 3D displacement of each voxel from the target regions. On the flip side, it is also true that in practice learning good classifiers seems to be easier than learning good regressors. This may be due to the fact that as a community we have had much more exposure to classification tasks than to regression ones. This paper shows that, for the application of anatomical bounding box localization, using a regression forest can be more accurate than using a classification approach.

Registration-based approaches. Although atlas-based methods have enjoyed much popularity (Fenchel et al., 2008; Shimizu et al., 2006; Yao et al., 2006), their conceptual simplicity belies the technical difficulty inherent in achieving robust, inter-subject registration. Robustness may be improved by using multi-atlas techniques (Isgum et al., 2009) but only at the expense of multiple registrations and hence increased computation time. Our algorithm incorporates atlas information within a compact tree-based model. As shown in the results section, such a model is more efficient than maintaining multiple atlases and achieves anatomy localization in only a few seconds. Comparisons with global affine atlas registration methods (similar to ours in computational cost) show that our algorithm produces lower errors and more stable predictions. Next we describe details of our approach.

Multivariate regression forests for organ localization

This section presents mathematical notation, problem parametrization and other details of our multi-organ regression forest with application to anatomy localization in CT images.

Mathematical notation. Vectors are represented in boldface (e.g. v), matrices as teletype capitals (e.g. Λ), and sets in calligraphic style (e.g. S). The position of a voxel in a CT volume is denoted v = (v_x, v_y, v_z).

The labeled database. The 26 anatomical structures we wish to recognize are C={abdomen, l. adrenal gland, r.

Results, comparisons and validation

This section assesses the proposed algorithm in terms of accuracy, runtime speed, and memory efficiency; and compares it to alternative techniques.

Conclusion

Anatomy localization has been cast here as a non-linear regression problem where all voxel samples vote for the position of all anatomical structures. Location estimates are obtained by a multivariate regression forest algorithm that is shown to be more accurate and efficient than competing registration-based and template-matching techniques.
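The voting idea can be sketched as a confidence-weighted average of per-voxel box predictions. This is a simplified illustration, not the paper's aggregation rule, which is not shown in this excerpt: each vote is assumed to be a six-wall box estimate paired with a scalar confidence weight, and `aggregate_votes` is a hypothetical helper name.

```python
from typing import List, Sequence, Tuple

# One vote: (predicted six wall positions, confidence weight). Both are
# assumed to come from the leaf reached by a voxel in some tree.
Vote = Tuple[Sequence[float], float]


def aggregate_votes(votes: Sequence[Vote]) -> List[float]:
    """Confidence-weighted mean of the absolute box predictions."""
    total = sum(weight for _, weight in votes)
    if total <= 0:
        raise ValueError("votes must carry positive total weight")
    dims = len(votes[0][0])
    return [sum(weight * box[d] for box, weight in votes) / total
            for d in range(dims)]


# Two made-up votes for one organ; the second is three times as confident.
votes = [
    ((0.0, 10.0, 0.0, 10.0, 0.0, 10.0), 1.0),
    ((4.0, 14.0, 4.0, 14.0, 4.0, 14.0), 3.0),
]
estimate = aggregate_votes(votes)
```

Repeating this per structure gives one box estimate for each organ from a single pass over the sampled voxels, with confident votes dominating the result.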

At the core of the algorithm is a new information-theoretic metric for regression tree learning which works by maximizing the confidence of the predictions

References (29)

  • Barbu, A., et al., 2012. Automatic detection and segmentation of lymph nodes from CT data. IEEE Trans. Med. Imaging.
  • Breiman, L., et al., 1984. Classification and Regression Trees.
  • Criminisi, A., et al., 2013. Decision Forests for Computer Vision and Medical Image Analysis.
  • Criminisi, A., Shotton, J., Bucciarelli, S., 2009. Decision forests with long-range spatial context for organ...
  • Fenchel, M., Thesen, S., Schilling, A., 2008. Automatic labeling of anatomical structures in MR fastview images using a...
  • Feulner, J., Zhou, S.K., Seifert, S., Cavallaro, A., Hornegger, J., Comaniciu, D., 2009. Estimating the Body Portion of...
  • Gall, J., Lempitsky, V., 2009. Class-specific Hough forest for object detection. In: IEEE CVPR,...
  • Gueld, M.O., Kohnen, M., Keysers, D., Schubert, H., Wein, B.B., Bredno, J., Lehmann, T.M., 2002. Quality of DICOM...
  • Hardle, W., 1990. Applied Non-Parametric Regression.
  • Ho, T.K., 1998. The random subspace method for constructing decision forests. IEEE Trans. PAMI.
  • Isgum, I., et al., 2009. Multi-atlas-based segmentation with local decision fusion: application to cardiac and aortic segmentation in CT scans. IEEE Trans. Med. Imaging.
  • Klein, S., et al., 2010. elastix: a toolbox for intensity-based medical image registration. IEEE Trans. Med. Imaging.
  • Konukoglu, E., Criminisi, A., Pathak, S., Robertson, D., White, S., Haynor, D., Siddiqui, K., 2011. Robust linear...
  • Liu, D., Zhou, K., Bernhardt, D., Comaniciu, D., 2010. Search strategies for multiple landmark detection by submodular...