Elsevier

Medical Image Analysis

Volume 18, Issue 8, December 2014, Pages 1262-1273

Encoding atlases by randomized classification forests for efficient multi-atlas label propagation

https://doi.org/10.1016/j.media.2014.06.010

Highlights

  • We propose encoding image atlases by randomized classification forests.

  • Scheme requires only a single registration for labeling of one target.

  • Increased efficiency through properties of the proposed encoding.

  • Evaluation of accuracy on 4 publicly available datasets.

Abstract

We propose a method for multi-atlas label propagation (MALP) based on encoding the individual atlases by randomized classification forests. Most current approaches perform a non-linear registration between all atlases and the target image, followed by a sophisticated fusion scheme. While these approaches can achieve high accuracy, in general they do so at high computational cost. This might negatively affect the scalability to large databases and experimentation. To tackle this issue, we propose to use a small and deep classification forest to encode each atlas individually in reference to an aligned probabilistic atlas, resulting in an Atlas Forest (AF). Our classifier-based encoding differs from current MALP approaches, which represent each point in the atlas either directly as a single image/label value pair, or by a set of corresponding patches. At test time, each AF produces one probabilistic label estimate, and their fusion is done by averaging. Our scheme performs only one registration per target image, achieves good results with a simple fusion scheme, and allows for efficient experimentation. In contrast to standard forest schemes, in which each tree would be trained on all atlases, our approach retains the advantages of the standard MALP framework. The target-specific selection of atlases remains possible, and incorporation of new scans is straightforward without retraining. The evaluation on four different databases shows accuracy within the range of the state of the art at a significantly lower running time.

Introduction

Labeling of healthy human brain anatomy is a crucial prerequisite for many clinical and research applications. Due to the effort involved (a fully manual labeling of a single brain takes 2–3 days (Klein and Tourville, 2012)) and increasing database sizes (e.g. ADNI, IXI, OASIS), a lot of research has been devoted to developing automatic methods for this task. While brain labeling is a general segmentation task (with a high number of labels), the standard approach is multi-atlas label propagation (MALP); see (Landman and Warfield, 2012) for an overview of the state of the art. With an atlas denoting a single labeled scan, MALP methods first derive a set of label proposals for the target image, each based on a single atlas, and then combine these proposals into a final estimate.

Currently, there are two main strategies for estimating atlas-specific label proposals. The first and larger group of methods non-linearly aligns each of the atlas images to the target image, and then – assuming one-to-one correspondence at each point – uses the atlas labels directly as label proposals, cf. e.g. (Rohlfing et al., 2004, Warfield et al., 2004, Heckemann et al., 2006). The second group of patch-based methods has recently enjoyed increased attention (Coupé et al., 2011, Rousseau et al., 2011, Wu et al., 2012). Here, the label proposal is estimated for each point in the target image by a local similarity-based search in the atlas. Patch-based approaches relax the one-to-one assumption, and aim at reducing the computational times by using linear instead of deformable alignment (Coupé et al., 2011, Rousseau et al., 2011), resulting in labeling running times of 22–130 min per target on the IBSR dataset (Rousseau et al., 2011). The fusion step, which combines the atlas-specific label proposals into a final estimate, aims to correct for inaccurate registration or labelings. While label fusion is a very active research topic, it is not the focus of this work. Additionally, some approaches perform further refinement, e.g. by learning classifiers for fine-scale class-based correction (Wang et al., 2012).

While current state-of-the-art techniques can achieve high levels of accuracy, they are in general computationally demanding. This is primarily due to the non-linear registration between all atlases and the target image, combined with the long running times of the best-performing registration schemes for the problem (Klein et al., 2009). Current methods state running times of 2–20 h per single registration (Landman and Warfield, 2012). Furthermore, sophisticated fusion schemes can also be computationally expensive. State-of-the-art approaches report fusion running times of 3–5 h (Wang et al., 2012, Asman and Landman, 2012a, Asman and Landman, 2012b).

High computational costs primarily limit scalability to large and growing databases, but they also limit the amount of experimentation that is possible during algorithm development.

Our method differs from previous MALP approaches in the way label proposals for a single atlas are generated, and is designed with the goal of low computational cost both at test time and during experimentation. In this work, we focus on the question of how a single atlas is encoded. From this point of view, methods assuming one-to-one correspondence represent an atlas directly as an image/label-map pair, while patch-based methods encode it by a set of localized patch collections. Variations of the patch-based encoding include use of sparsity (Wu et al., 2012), or use of label-specific kNN search structures (Wang et al., 2013).

In contrast to previous representations, we encode a single atlas together with its relation to label priors by a small and deep classification forest – which we call an Atlas Forest (AF). Given a target image as input (and an aligned probabilistic atlas), each AF returns a probabilistic label estimate for the target. Label fusion is then performed by averaging the probability estimates obtained from different AFs. Please see Fig. 1 for an overview of our method. While patch-based methods use a static representation for each image point (i.e. a patch of fixed size), our encoding is spatially varying. In the training step, our approach learns to describe different image points by differently shaped features, depending on the point’s contextual appearance.
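The fusion step described above, averaging the probabilistic estimates of the individual AFs, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the data layout (one dictionary of label probabilities per voxel), the function names, and the final arg-max step used to obtain hard labels are assumptions made for this example.

```python
from typing import Dict, List

# Assumed layout: each atlas forest yields, for every voxel, a mapping
# label -> probability. Real systems would use dense arrays instead.
VoxelProbs = Dict[str, float]


def fuse_by_averaging(estimates: List[List[VoxelProbs]]) -> List[VoxelProbs]:
    """Average per-voxel label probabilities over all atlas forests."""
    if not estimates:
        raise ValueError("need at least one estimate")
    fused = []
    for v in range(len(estimates[0])):
        # Collect every label that any forest proposes for this voxel.
        labels = set()
        for est in estimates:
            labels.update(est[v])
        fused.append({
            lab: sum(est[v].get(lab, 0.0) for est in estimates) / len(estimates)
            for lab in sorted(labels)
        })
    return fused


def hard_labels(fused: List[VoxelProbs]) -> List[str]:
    """Assumed final step: pick the most probable label per voxel."""
    return [max(p, key=p.get) for p in fused]


# Two hypothetical forests, two voxels, two labels.
af1 = [{"gm": 0.75, "wm": 0.25}, {"gm": 0.25, "wm": 0.75}]
af2 = [{"gm": 0.5, "wm": 0.5}, {"gm": 0.25, "wm": 0.75}]
fused = fuse_by_averaging([af1, af2])
print(hard_labels(fused))
```

Because every forest contributes with equal weight, adding or removing an AF only changes which estimates enter the average.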

Compared to current MALP methods, our approach has the following important characteristics:

  1. Only one registration per target is required. This registration aligns the probabilistic atlas to the target. Consequently, the registration cost is independent of the database size. This differs conceptually from patch-based approaches, where the efficiency does not come from reducing the number of registrations, but from using affine instead of non-linear transformations.

  2. Efficient generation of atlas proposals and their fusion. For proposal generation, one AF per atlas is evaluated. Due to the inherent efficiency of tree-based classifiers at test time, this is significantly more efficient than current approaches.

  3. Efficient experimentation. A leave-one-out cross-validation of a standard MALP approach on n atlases requires registration between all images, thus scaling with n². In contrast, the training of the single AFs, which is the most costly component of our approach for experimentation, scales with n (this assumes that generating the probabilistic atlas is not part of experimentation).
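To make the scaling argument in the third point concrete, the following sketch counts the dominant operations for a leave-one-out experiment. It rests on a counting convention assumed here, not stated in the text: every image is registered to every other image once per direction. If each pair is registered only once, the count halves but remains quadratic. The function names are hypothetical.

```python
def malp_leave_one_out_registrations(n_atlases: int) -> int:
    """Atlas-to-target registrations when each image is the target once
    (assumed: one directed registration per ordered image pair)."""
    return n_atlases * (n_atlases - 1)


def atlas_forest_trainings(n_atlases: int) -> int:
    """Each atlas forest is trained exactly once, whatever the split."""
    return n_atlases


for n in (10, 20, 40):
    print(n, malp_leave_one_out_registrations(n), atlas_forest_trainings(n))
```

Doubling the database roughly quadruples the registration count, while the number of forest trainings only doubles.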

Besides being efficient, experiments on 4 databases in Section 3 indicate that our scheme also achieves accuracy within the range of the state of the art.

Being based on discriminative classifiers, our approach is also related to a number of works which employ machine learning techniques. Compared to the use of multi-atlas label propagation techniques discussed above, the use of machine learning for brain labeling is still relatively limited. In (Tu et al., 2008), a hybrid model is proposed, which combines a discriminative probabilistic-boosting tree (PBT) classifier (Tu, 2005) with a PCA-based generative shape model of the individual anatomical structures. In (Tu and Bai, 2010), the Auto-Context framework with the PBT classifier was applied to brain labeling, and shown to outperform (Tu et al., 2008). Recently, the use of classifiers to correct systematic mistakes of labeling methods in a post-processing step has been shown to improve accuracy (Wang et al., 2011, Wang et al., 2012).

The major difference of these works to our approach is that they use the common scheme in which all available atlases are used for the training of one classifier. This is also true of standard forest schemes (cf. e.g. (Shotton et al., 2011, Iglesias et al., 2011a, Montillo et al., 2011, Zikic et al., 2012)) which train each tree on data from all training images.

In contrast, the main idea of this paper is to use one classifier to encode a single atlas by training it only on this exemplar. This approach has three advantageous properties for the multi-atlas label propagation setting.

  1. Simple incorporation of new atlases into the database. For standard forest schemes, addition of new training data requires complete retraining or approximations. In our scenario, a new forest is simply trained on the new atlas exemplar and added to the other, previously trained AFs.

  2. Selection of atlases for target-specific evaluation is straightforward since every AF is associated with a single atlas. This property allows use of atlas-selection (Aljabar et al., 2009), which can improve accuracy and reduce the computational cost. This step seems non-obvious for standard forest schemes where predictions are not separable with respect to specific atlases.

  3. Efficient experimentation. For cross-validation, standard schemes have to be trained for every training/testing split of data, which is extremely costly. In our scenario, each AF is trained only once. Any leave-k-out test is performed simply by using the subset of n-k AFs corresponding to the training data. This point can be seen as a generalization of the corresponding experimentation efficiency property in the MALP setting.
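The three properties above follow from keeping one independently trained predictor per atlas. The sketch below illustrates this. The class `AtlasForestBank`, its methods, and the constant stand-in predictors are hypothetical; a real AF would be a trained classification forest returning full label probabilities rather than a single foreground probability per voxel.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical stand-in for a trained atlas forest: maps a target image
# (here a flat list of intensities) to a per-voxel foreground probability.
Predictor = Callable[[List[float]], List[float]]


class AtlasForestBank:
    """Keeps one independently trained predictor per atlas."""

    def __init__(self) -> None:
        self._forests: Dict[str, Predictor] = {}

    def add(self, atlas_id: str, forest: Predictor) -> None:
        # Adding an atlas never touches the previously trained forests.
        self._forests[atlas_id] = forest

    def predict(self, target: List[float],
                exclude: Iterable[str] = ()) -> List[float]:
        # Leave-k-out or atlas selection: skip the excluded forests.
        excluded = set(exclude)
        used = [f for a, f in self._forests.items() if a not in excluded]
        if not used:
            raise ValueError("no atlas forests left after exclusion")
        outputs = [f(target) for f in used]
        return [sum(col) / len(col) for col in zip(*outputs)]


bank = AtlasForestBank()
bank.add("atlas1", lambda t: [0.25 for _ in t])
bank.add("atlas2", lambda t: [0.5 for _ in t])
bank.add("atlas3", lambda t: [0.75 for _ in t])

target = [0.0, 1.0]
print(bank.predict(target))                      # all three forests
print(bank.predict(target, exclude=["atlas3"]))  # leave-one-out subset

bank.add("atlas4", lambda t: [1.0 for _ in t])   # new atlas, no retraining
print(bank.predict(target))
```

The same `exclude` argument serves both cross-validation splits and target-specific atlas selection, since both amount to choosing a subset of forests.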

In general, training ensemble classifiers on disjoint subsets of data cannot be expected to reach higher accuracy than training each classifier on all data or overlapping subsets, especially if the subsets are different atlases. The difference in accuracy between the two models will depend on the application, and especially on the similarity of the atlases to each other. Furthermore, in practice, the computational complexity of each model also limits how well its parameters can be tuned so that it performs as close as possible to its theoretical limit. In Section 3.1.2, we experimentally show that the accuracy of the proposed scheme and a 'reasonable' standard forest scheme seems to be on approximately the same level for the brain labeling task.

The main idea of thinking about a single atlas as a classifier is already mentioned for example in (Rohlfing et al., 2005). And indeed, the action of a single warped atlas in a standard MALP setting is that of a classifier – however a very simple one: For each spatial point the warped atlas will assign the value from the corresponding warped atlas label map.
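The "very simple" classifier described above amounts to a table lookup. A minimal sketch, assuming the warped label map is stored as a flat sequence indexed by point (a layout and function name chosen for this example only):

```python
from typing import Callable, Sequence


def make_warped_atlas_classifier(
        warped_labels: Sequence[str]) -> Callable[[int], str]:
    """Wrap a warped atlas label map as a classifier over point indices."""

    def classify(point: int) -> str:
        # No learning involved: the prediction is the label stored at the
        # corresponding point of the warped atlas label map.
        return warped_labels[point]

    return classify


classify = make_warped_atlas_classifier(
    ["background", "hippocampus", "hippocampus", "background"])
print(classify(1))
```

Such a classifier ignores the target's appearance entirely, so its accuracy depends on the quality of the registration that produced the warped label map.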

In this work, we propose the use of non-trivial machine learning-based classifiers to encode individual atlases in the MALP setting. We demonstrate that this approach exceeds the standard encoding in terms of efficiency while maintaining high accuracy, and that it has additional advantages over standard learning schemes, as discussed in detail above.

Our work on atlas forests was originally presented as a conference paper (Zikic et al., 2013a). This article extends the conference publication by providing a new evaluation with a simplified system, a detailed evaluation and analysis of the method, and a hopefully improved overall presentation. To the best of our knowledge, the only other work that considers the use of non-trivial classifiers trained on individual atlases is (Akhondi-Asl and Warfield, 2013). The focus of that work is on a generalization of the STAPLE fusion method (Warfield et al., 2004) to operate on probabilistic estimates rather than thresholded label estimates. To generate per-atlas probabilistic estimates, (Akhondi-Asl and Warfield, 2013) uses a Gaussian Mixture Model (GMM) of patch intensities, and trains an individual GMM for each atlas. This article focuses on efficiency and on the relation of the proposed scheme to existing machine learning schemes. It differs from previous work in technical details through use of a different classifier in combination with probabilistic atlases, and a simple averaging of probabilities as the fusion method. After describing the details of the method in the next section, we evaluate its performance and analyze it in Section 3, and discuss and summarize its properties in Section 4.

Section snippets

Method – Atlas Forests

An atlas forest (AF) encodes a single atlas by training one randomized classification forest (Breiman, 2001) exclusively on the data from the atlas. Every point in the atlas is described by its (contextual) appearance only, without considering its location (this can be seen as an even further relaxation of the one-to-one assumption, compared to patch-based approaches).

While this allows us to avoid registration of atlases to the target image, a problem with such a location-oblivious approach is

Evaluation and analysis

We evaluate our approach on four brain MRI data sets:

  • IBSR database (Section 3.1).

  • LPBA40 database (Section 3.2).

  • MICCAI 2012 Multi-Atlas Labeling Challenge (Section 3.3).

  • MICCAI 2013 SATA Challenge (Section 3.4).

Additionally, we perform an analysis of the influence of the different method components and their variations in Sections 3.1.1 (Influence of method components), 3.1.2 (Comparison to the standard forest scheme), 3.1.3 (Auto-context variation), and 3.1.4 (Parameter settings), and analyse the structure of the

Discussion and summary

When comparing the proposed method to standard forest schemes, two interesting points arise: relation of our approach to standard bagging strategies, and the issue of over-training.

Bagging is a strategy for diversifying trees through randomization, by selecting a random subset of samples for the training of each tree. Single trees are then non-linear probabilistic approximating functions for a random sample subset, and the forest prediction is their linear combination. This strategy has the

References (35)

  • Asman, A., Akhondi-Asl, A., Wang, H., Tustison, N., Avants, B., Warfield, S.K., Landman, B., 2013. Miccai 2013...
  • L. Breiman. Random forests. Machine Learning (2001)
  • B. Glocker et al. Dense image registration through MRFs and efficient linear programming. MedIA (2008)
  • Iglesias, J.E., Konukoglu, E., Montillo, A., Tu, Z., Criminisi, A., 2011a. Combining generative and discriminative...
  • J.E. Iglesias et al. Robust brain extraction across datasets and comparison with publicly available methods. IEEE TMI (2011)
  • A. Klein et al. 101 Labeled brain images and a consistent human cortical labeling protocol. Front. Brain Imag. Methods (2012)