Shape regression machine and efficient segmentation of left ventricle endocardium from 2D B-mode echocardiogram

doi:10.1016/j.media.2010.04.002

Medical Image Analysis

Volume 14, Issue 4, August 2010, Pages 563-581

https://doi.org/10.1016/j.media.2010.04.002

Abstract

We present a machine learning approach called shape regression machine (SRM) for efficient segmentation of an anatomic structure that exhibits a deformable shape in a medical image, e.g., the left ventricle endocardial wall in an echocardiogram. The SRM achieves efficient segmentation via statistical learning of the interrelations among shape, appearance, and anatomy, which are exemplified by an annotated database. The SRM is a two-stage approach. The first stage estimates a rigid shape to solve the automatic initialization problem: it derives a regression solution to object detection that needs just one scan in principle and a sparse set of scans in practice, avoiding the exhaustive scanning required by the state-of-the-art classification-based detection approach while yielding comparable detection accuracy. The second stage estimates the nonrigid shape: it again learns a nonlinear regressor to directly associate nonrigid shape with image appearance. Underpinning both stages is a novel image-based boosting ridge regression (IBRR) method that enables multivariate, nonlinear modeling and accommodates fast evaluation. We demonstrate the efficiency and effectiveness of the SRM using experiments on segmenting the left ventricle endocardium from B-mode echocardiograms of the apical four chamber view. The proposed algorithm is able to automatically detect and accurately segment the LV endocardial border in about 120 ms.

Introduction

The left ventricle (LV) endocardium is the innermost layer of tissue that lines the LV, one of the four chambers of the heart. The segmentation of the LV endocardium from a medical image is a crucial step toward characterizing heart functionality. For example, deriving the most important cardiac measurement, the LV ejection fraction (the fraction of blood pumped out of the LV per heartbeat), requires segmenting the LV endocardium from an image. In this paper, we study the problem of segmenting the LV endocardium from a 2D B-mode echocardiogram, which is a 2D image slice of the heart acquired by an ultrasonic imaging device. In particular, we focus on the canonical apical four chamber (A4C) view acquired using a transthoracic transducer. An A4C echocardiogram contains all four heart chambers, namely the left ventricle (LV), right ventricle (RV), left atrium (LA), and right atrium (RA). Refer to Fig. 1 for an anatomical illustration and several example images of A4C echocardiography.

Because the LV endocardium in an echocardiogram manifests a deformable shape, localizing it can be solved as deformable shape segmentation, a long-standing challenge in medical image analysis. Numerous good algorithms have been proposed in the literature to tackle this general problem, among which there are three representative approaches: the active contour/snake (Kass et al., 1988), the Mumford–Shah variational method (Mumford and Shah, 1989), and the active shape models (ASM) (Cootes et al., 1995)/active appearance models (AAM) (Cootes et al., 2001).

The active contour/snake approach (Kass et al., 1988) seeks a parameterized curve C(s) that minimizes the cost function $E_{snake}(C)$: $E_{snake}(C) = \int_{0}^{1} \{ -\mu |\nabla I(C(s))|^{2} + w_{1}(s) |C'(s)|^{2} + w_{2}(s) |C''(s)|^{2} \} \, ds,$ where μ controls the magnitude of the potential, ∇ is the gradient operator, I is the image, w₁(s) controls the tension of the curve, and w₂(s) controls the rigidity of the curve. Because the image enters only through the gradient term, the snake model implicitly assumes that the curve is defined by edges.
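As a minimal sketch of how this cost could be evaluated numerically, the snippet below discretizes the integral for a closed curve sampled at uniform steps in s, using central finite differences for C'(s) and C''(s). The function name `snake_energy`, the callable `grad_mag_sq` (returning |∇I|² at a location), and the use of constant weights w₁ and w₂ are assumptions made for illustration; they are not taken from Kass et al. (1988).

```python
def snake_energy(points, grad_mag_sq, mu=1.0, w1=0.1, w2=0.01):
    """Discrete approximation of E_snake for a closed, uniformly sampled curve.

    points      : list of (x, y) samples of C(s), s uniformly spaced on [0, 1)
    grad_mag_sq : hypothetical callable (x, y) -> |grad I|^2 at that location
    mu, w1, w2  : constant weights (the continuous model allows w1(s), w2(s))
    """
    n = len(points)
    h = 1.0 / n  # spacing of the curve parameter s
    energy = 0.0
    for i in range(n):
        xp, yp = points[i - 1]          # previous sample (wraps around)
        x, y = points[i]
        xn, yn = points[(i + 1) % n]    # next sample (wraps around)
        # Central differences for the first and second derivatives of C(s)
        d1x, d1y = (xn - xp) / (2 * h), (yn - yp) / (2 * h)
        d2x, d2y = (xn - 2 * x + xp) / h ** 2, (yn - 2 * y + yp) / h ** 2
        external = -mu * grad_mag_sq(x, y)          # rewards strong edges
        tension = w1 * (d1x ** 2 + d1y ** 2)        # penalizes stretching
        rigidity = w2 * (d2x ** 2 + d2y ** 2)       # penalizes bending
        energy += (external + tension + rigidity) * h
    return energy
```

With w₁ = w₂ = 0 the energy depends only on the edge strength under the curve, which makes the edge assumption explicit: where the true boundary has no gradient support, nothing attracts the curve to it.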

In the Mumford–Shah variational method (Mumford and Shah, 1989), the most commonly studied case is the minimal partition problem, where a curve C is sought to minimize the cost function $E_{ms}(C)$: $E_{ms}(C) = \int_{\Omega_{i}} |I(x, y) - u_{i}|^{2} \, dx \, dy + \int_{\Omega_{o}} |I(x, y) - u_{o}|^{2} \, dx \, dy + \mu L(C),$ where Ω_i and Ω_o denote the regions inside and outside the curve C, respectively, u_i and u_o are piecewise constants for the two regions, and $L(C)$ is the length of the curve. Region homogeneity is implicitly assumed here. The curve C can be described explicitly or implicitly. The most commonly used implicit representation is the so-called level set function (Chan and Vese, 2001).
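The following sketch evaluates a discrete version of this two-region cost for a given binary partition of a small image. It assumes that, for a fixed curve, u_i and u_o are set to the mean intensities of their regions, and it approximates L(C) by the number of neighboring pixel pairs that straddle the boundary. The name `two_region_energy` and the pixel-mask representation are illustrative choices, not part of the original formulation.

```python
def two_region_energy(image, inside, mu=1.0):
    """Discrete piecewise-constant energy for a binary partition.

    image  : 2D list of intensities
    inside : 2D list of bools, True where the pixel lies inside the curve C
    mu     : weight of the curve-length penalty
    """
    rows, cols = len(image), len(image[0])
    in_vals = [image[r][c] for r in range(rows) for c in range(cols) if inside[r][c]]
    out_vals = [image[r][c] for r in range(rows) for c in range(cols) if not inside[r][c]]

    def sse(values):
        # Sum of squared deviations from the region mean (0 for an empty region)
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    # Approximate L(C) by counting 4-neighbor pixel pairs with different labels
    length = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and inside[r][c] != inside[r][c + 1]:
                length += 1
            if r + 1 < rows and inside[r][c] != inside[r + 1][c]:
                length += 1
    return sse(in_vals) + sse(out_vals) + mu * length
```

On an image made of two homogeneous regions, the data terms vanish for the true partition and only the length penalty remains. When intensities inside a region vary strongly, the data terms no longer single out the true boundary.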

The ASM (Cootes et al., 1995) characterizes the shape C using a linear generative model: $C = \bar{C} + Q_{c} a,$ where $\bar{C}$ is the mean shape, $Q_{c}$ is the matrix of shape basis vectors, and a is the blending coefficient vector. The optimal shape C is obtained by iterating two steps: (i) searching around each point on C for a better position and (ii) updating the model parameter a to best describe these newly found positions. When searching for a better position for each point, criteria such as strong edges or a match to an expected statistical model are used. The AAM (Cootes et al., 2001) jointly characterizes the appearance I and shape C using a linear generative model: $C = \bar{C} + Q_{c} a; \quad I = \bar{I} + Q_{i} a,$ where $\bar{I}$ is the mean appearance in a normalized (so-called shape-free) patch and a is the blending coefficient vector shared by both the shape and appearance. The model parameters a, along with the similarity transformation parameters t applied to the shape and the scaling/offset variables u applied to the appearance, are found by fitting the AAM to the observed image using a mean square error criterion and an analysis-by-synthesis approach. In particular, the displacement δp in the parameters of interest p = [a^T, t^T, u^T]^T is predicted via a linear form arising from the first-order Taylor expansion: $\delta p = R(p) \, \delta I(p),$ where R(p) is the Jacobian matrix and δI(p) is the difference between the shape-free image patch extracted from the input image and the synthesized image parameterized by p. To some extent, this can also be viewed as performing a linear regression.
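The linear models above can be illustrated with a short sketch. It assumes that the columns of Q_c are orthonormal (as they would be if obtained from a principal component analysis), so that step (ii) of the ASM iteration reduces to a projection. The function names and the toy numbers are hypothetical.

```python
def fit_coefficients(shape, mean_shape, modes):
    """ASM step (ii): least-squares a for shape ~ mean_shape + Q_c a.

    modes holds the columns of Q_c as lists; they are ASSUMED orthonormal,
    so the solution reduces to a = Q_c^T (shape - mean_shape).
    """
    residual = [s - m for s, m in zip(shape, mean_shape)]
    return [sum(q * r for q, r in zip(mode, residual)) for mode in modes]


def synthesize(mean_shape, modes, a):
    """Evaluate the generative model C = mean_shape + Q_c a."""
    shape = list(mean_shape)
    for coeff, mode in zip(a, modes):
        shape = [s + coeff * q for s, q in zip(shape, mode)]
    return shape


def predict_displacement(R, delta_I):
    """AAM-style linear prediction delta_p = R delta_I (R given as a list of rows)."""
    return [sum(r * d for r, d in zip(row, delta_I)) for row in R]


# Toy example: a 4-dimensional shape vector and two orthonormal modes
mean_shape = [1.0, 1.0, 1.0, 1.0]
modes = [[0.6, 0.8, 0.0, 0.0], [-0.8, 0.6, 0.0, 0.0]]
observed = [3.0, 2.0, 6.0, 1.0]          # candidate positions found in step (i)
a = fit_coefficients(observed, mean_shape, modes)
regularized = synthesize(mean_shape, modes, a)  # shape constrained to the model
```

In this toy example the third coordinate of `observed` lies outside the span of the two modes, so the synthesized shape discards that deviation. This is the sense in which the linear model constrains the search.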

However, directly applying the above three approaches without modification to the problem of LV endocardium segmentation from an A4C echocardiogram does not yield satisfactory performance. Fig. 1b–d presents several A4C examples that manifest the following challenges:

• The LV endocardium is not defined by the edge, e.g., it cuts through the papillary muscle attached to the endocardial wall. This makes the active contour/snake approach (Kass et al., 1988) ineffective.
• Region homogeneity is severely violated due to ultrasound imaging artifacts and signal dropouts, which poses difficulty for the Mumford–Shah variational method (Mumford and Shah, 1989).
• The variations in the shape and appearance of the LV endocardium cannot be sufficiently characterized by linear models due to differences in instrument, patient, and sonographer, respiratory interference, unnecessary probe movements, etc. We empirically find that, for the LV endocardial border, the linear shape model is quite effective in practice, but the ASM (Cootes et al., 1995) lacks a good matching criterion for searching for a better position for each point. The linear appearance model in the AAM (Cootes et al., 2001), though quite useful on many occasions, is suboptimal. Nonlinear modeling is more appropriate.

Furthermore, the above methods need good initialization, which requires solving a detection problem; otherwise, they are prone to being trapped in local minima. All these challenges have motivated alternative solutions in the literature to the problem of LV endocardium segmentation. In Section 2.1, we review various approaches proposed in the literature for segmenting the LV endocardium from 2D echocardiograms (Klingler et al., 1988, Binder et al., 1999, Mishra et al., 2003, Mignotte and Meunier, 2001, Yan and Zhuang, 2003, Lin et al., 2003, Chen et al., 2003, Bosch et al., 2002, Mignotte et al., 2001, Boukerroui et al., 2003, Georgescu et al., 2005). Most of them are based on image processing methods (Klingler et al., 1988) such as morphological operations and pixel classification (Binder et al., 1999), or on variants of active contour (Mishra et al., 2003, Mignotte and Meunier, 2001), variational methods/level set (Yan and Zhuang, 2003, Lin et al., 2003, Chen et al., 2003), extensions of the AAM (Bosch et al., 2002), and deformable templates (Mignotte et al., 2001), etc.

In this paper, we present a novel machine learning approach called shape regression machine (SRM) (Zhou and Comaniciu, 2007, Zhou and Zhou, 2007). It enables efficient segmentation of an anatomical structure because it leverages the underlying anatomical image context manifested in medical images. The anatomical image context is defined as the interrelations among shape, appearance, and anatomy and is best exemplified by an annotated database of sufficient size. Because linear models have difficulty in modeling the complicated anatomical image context, we derive nonlinear models based on machine learning for discriminative characterization. In particular, we utilize regression to extract knowledge from the annotated database. Once the regression functions are learned, the database no longer needs to be kept in memory. The SRM approach is able to deal with deformable contours that are not necessarily supported by edges or region homogeneity. In addition, it is fully automatic, requiring no manual interaction, and runs in less than a second. Section 3 depicts the general principle of the SRM approach and its two components, namely regression-based object detection (Section 3.1) and regression-based deformable shape inference (Section 3.2).

Section 4 elaborates an image-based boosting ridge regression (IBRR) method that underpins the SRM. The IBRR is an enhanced version of the image-based boosting regression proposed earlier in Zhou et al. (2005). Image-based regression has recently become an emerging topic in the vision literature. Section 2.2 presents a review of this literature.

In Section 5, we present the experimental results of segmenting the LV endocardium from the A4C echocardiogram to demonstrate the effectiveness and efficiency of the SRM approach when compared with various state-of-the-art approaches. Section 6 presents two discussion topics and Section 7 concludes the paper.

Section snippets

LV endocardium segmentation from 2D B-mode echocardiogram

In this section, we give a brief review of algorithms specialized in segmenting the LV endocardial wall from 2D B-mode echocardiogram. Many of the below-reviewed approaches are based on image processing methods such as morphological operations and pixel classification, or variants of active contour, variational methods/level set, and deformable template, etc. Also, only a few dealt with echocardiograms of apical view (or long-axis view) with the majority focusing on parasternal short-axis view,

Shape regression machine

In this paper, the shape C is represented by two parts: rigid and deformable. The rigid transformation is parameterized by θ and the deformable part by S. If a similarity transformation is used as θ, then the above shape representation reduces to Kendall’s interpretation (Kendall et al., 1999). To rigidly align the LV shape in the A4C echocardiogram more accurately, we use a 5D-parameterization $\theta = (t_{x}, t_{y}, \log(s_{x}), \log(s_{y}), \alpha),$ with (t_x, t_y) for translation, α for orientation, and (s_x, s_y) for scale (or
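To make the 5D parameterization concrete, the sketch below maps points of a normalized shape into image coordinates given θ. The composition order used here (anisotropic scaling, then rotation by α, then translation) and the function name `apply_rigid` are assumptions for illustration; this excerpt does not specify them.

```python
import math


def apply_rigid(theta, points):
    """Map normalized shape points to image coordinates.

    theta  : (t_x, t_y, log s_x, log s_y, alpha), as in the 5D parameterization
    points : list of (x, y) in the normalized shape frame
    ASSUMED order: scale each axis, rotate by alpha, then translate.
    """
    t_x, t_y, log_sx, log_sy, alpha = theta
    s_x, s_y = math.exp(log_sx), math.exp(log_sy)  # log-scales always give positive scales
    cos_a, sin_a = math.cos(alpha), math.sin(alpha)
    mapped = []
    for x, y in points:
        xs, ys = s_x * x, s_y * y                       # anisotropic scaling
        mapped.append((t_x + cos_a * xs - sin_a * ys,   # rotation + translation
                       t_y + sin_a * xs + cos_a * ys))
    return mapped
```

One plausible reading of the logarithmic scale terms is that they keep the recovered scales positive and treat enlargement and shrinkage symmetrically when θ is estimated by regression.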

Image-based boosting ridge regression (IBRR)

The underpinning of the above two SRM stages is a regression procedure that takes an image patch as input and outputs a multidimensional variable. In this section, we invoke the influential boosting framework (Freund and Schapire, 1997, Friedman et al., 2000) to derive a novel regression algorithm called image-based boosting ridge regression (IBRR). Throughout this section, we assume that the input is $x \in \mathbb{R}^{p}$ and the output is $y(x) \in \mathbb{R}^{q}$, and that N training data points $\{(x_{n}, y(x_{n}))\}_{n=1}^{N}$ are available. Note
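The details of IBRR are not included in this excerpt. Purely to illustrate the general idea of combining boosting with ridge-penalized fitting for a multidimensional output, the sketch below implements a generic additive model: each round greedily picks the single input feature whose ridge-penalized linear fit best explains the current residual, then adds a shrunken copy of that fit. The weak-learner form, the function names, and the parameters `lam` and `shrink` are assumptions; this is not the authors' IBRR algorithm.

```python
def fit_boosted_ridge(X, Y, rounds=20, lam=1.0, shrink=0.5):
    """Generic boosting with ridge-penalized single-feature weak learners.

    X : list of N inputs, each a list of p features
    Y : list of N outputs, each a list of q values
    Returns a list of (feature index, q-dimensional coefficient vector).
    """
    n, p, q = len(X), len(X[0]), len(Y[0])
    residual = [list(y) for y in Y]
    model = []
    for _ in range(rounds):
        best = None
        for j in range(p):
            # Closed-form ridge solution for a one-feature, q-output linear fit
            denom = sum(X[i][j] ** 2 for i in range(n)) + lam
            w = [sum(X[i][j] * residual[i][k] for i in range(n)) / denom
                 for k in range(q)]
            err = sum((residual[i][k] - X[i][j] * w[k]) ** 2
                      for i in range(n) for k in range(q))
            if best is None or err < best[0]:
                best = (err, j, w)
        _, j, w = best
        w = [shrink * wk for wk in w]  # shrinkage damps each boosting step
        model.append((j, w))
        for i in range(n):
            for k in range(q):
                residual[i][k] -= X[i][j] * w[k]
    return model


def predict(model, x, q):
    """Evaluate the additive model: a sum of single-feature weak learners."""
    out = [0.0] * q
    for j, w in model:
        for k in range(q):
            out[k] += x[j] * w[k]
    return out
```

Evaluation costs one multiply-add per selected weak learner and output dimension, which hints at why additive models of this kind can be fast at test time.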

Experimental results

We applied the SRM approach to segmenting the LV endocardium from 2D echocardiograms. Below, we first elaborate in Section 5.1 the details of data collection and annotation. Then we present the experimental results of the regression-based detection approach in Sections 5.2 (2D translation detection) and 5.3 (rigid object detection). Finally, we demonstrate in Section 5.4 the effectiveness and efficiency of IBRR when applied to deformable shape inference, in comparison with other regression methods.

Classification-based vs. regression-based object detection

A successful object detection approach based on machine learning must harness the learning complexity in its off-line learning and the computational complexity in its online inference from a test image.

• Learning complexity. In the classification-based approach, the main challenge lies in handling the number of negatives – anything other than positive is negative, apart from the large image appearance variations in positives and negatives. In theory, one image contributes one positive (assuming

Conclusion

We have presented a machine learning approach called shape regression machine (SRM) for fast detection and segmentation of anatomical structure. The SRM effectively utilizes the anatomical context in medical images to eliminate unfavorable restrictions posed by conventional deformable shape segmentation methods. In particular, the detection solution in the SRM replaces the exhaustive scanning of the query image required by the classification-based detector by sparse scanning and reaches

References (49)

T. Binder et al.
Artificial neural networks and spatial temporal contour linking for automated endocardial contour detection on echocardiograms: A novel approach to determine left ventricular contractile function
Ultrasound Med. Biol.
(1999)
D. Boukerroui et al.
Segmentation of ultrasound images - multiresolution 2-D and 3-D algorithm based on global and local statistics
Pattern Recognit. Lett.
(2003)
T.F. Cootes et al.
Active shape models—their training and application
Comput. Vision Image Understand.
(1995)
Y. Freund et al.
A decision-theoretic generalization of on-line learning and an application to boosting
J. Comput. Syst. Sci.
(1997)
N. Lin et al.
Combinative multi-scale level set framework for echocardiographic image segmentation
Med. Image Anal.
(2003)
J. Mazziotta et al.
A probabilistic atlas of the human brain: theory and rationale for its development
Neuroimage
(1995)
M. Mignotte et al.
A multiscale optimization approach for the dynamic contour-based boundary detection issue
Comput. Med. Imag. Graph.
(2001)
A. Mishra et al.
A GA based approach for boundary detection of left ventricle with echocardiographic image sequences
Image Vision Comput.
(2003)
B. Vemuri et al.
Image registration via level-set motion: applications to atlas-based segmentation
Med. Image Anal.
(2003)
J.Y. Yan et al.
Applying improved fast marching method to endocardial boundary detection in echocardiographic images
Pattern Recognit. Lett.
(2003)

N. Andreasen et al.
Automatic atlas-based volume estimation of human brain regions from MR images
J. Comput. Assist. Tomo.
(1996)
Agarwal, A., Triggs, B., 2004. 3D human pose from silhouette by relevance vector regression. In: Proceedings of the...
Bissacco, A., Yang, M., Soatto, S., 2007. Fast human pose estimation using appearance and motion via multi-dimensional...
J.G. Bosch et al.
Automatic segmentation of echocardiographic sequences by active appearance motion models
IEEE Trans. Med. Imag.
(2002)
G. Carneiro et al.
Detection and measurement of fetal anatomies from ultrasound images using a constrained probabilistic boosting tree
IEEE Trans. Med. Imag.
(2008)
T. Chan et al.
Active contours without edges
IEEE Trans. Image Process.
(2001)
Chen, Y., Huang, F., Tagare, H.D., Rao, M., Wilson, D., Geiser, E.A., 2003. Using prior shape and intensity profile in...
Cootes, T., Beeston, C., Edwards, G., Taylor, C., 1999. A unified framework for atlas matching using active appearance...
T.F. Cootes et al.
Active appearance models
IEEE Trans. Pattern Anal. Machine Intell.
(2001)
H. Copas
Regression, prediction, and shrinkage
J. R. Stat. Soc. B
(1983)
Covell, M., 1996. Eigen-points: control-point location using principal component analysis. In: International Conference...
Cristinacce, D., Cootes, T., 2007. Boosted regression active shape models. In: Proceedings of the British Machine...
N. Duffy et al.
Boosting methods for regression
Machine Learn.
(2002)
J. Friedman
Greedy function approximation: a gradient boosting machine
Ann. Stat.
(2001)
