Shape regression machine and efficient segmentation of left ventricle endocardium from 2D B-mode echocardiogram
Introduction
The left ventricle (LV) endocardium is the innermost layer of tissue that lines the LV, one of the four chambers of the heart. Segmenting the LV endocardium from a medical image is a crucial step toward characterizing heart function. For example, deriving the most important cardiac measurement, the LV ejection fraction, which measures the fraction of blood pumped out of the LV per heartbeat, requires segmenting the LV endocardium from an image. In this paper, we study the problem of segmenting the LV endocardium from a 2D B-mode echocardiogram, which is a 2D image slice of the heart acquired by an ultrasonic imaging device. In particular, we focus on the canonical apical four-chamber (A4C) view acquired using a transthoracic transducer. An A4C echocardiogram contains all four heart chambers, namely the left ventricle (LV), right ventricle (RV), left atrium (LA), and right atrium (RA). Refer to Fig. 1 for an anatomical illustration and several example A4C echocardiograms.
Because the LV endocardium in an echocardiogram manifests a deformable shape, localizing it can be posed as a deformable shape segmentation problem, a long-standing challenge in medical image analysis. Numerous algorithms have been proposed in the literature to tackle this general problem, among which there are three representative approaches: the active contour/snake (Kass et al., 1988), the Mumford–Shah variational method (Mumford and Shah, 1989), and the active shape models (ASM) (Cootes et al., 1995)/active appearance models (AAM) (Cootes et al., 2001).
The active contour/snake approach (Kass et al., 1988) seeks a parameterized curve C(s) that minimizes the cost function

$$E(C) = \int \left[ -\mu\,|\nabla I(C(s))|^2 + w_1(s)\,|C'(s)|^2 + w_2(s)\,|C''(s)|^2 \right] ds,$$

where μ controls the magnitude of the potential, ∇ is the gradient operator, I is the image, w1(s) controls the tension of the curve, and w2(s) controls the rigidity of the curve. Because of the gradient operator, the snake model implicitly assumes that edges define the curve.
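To make the roles of μ, w1, and w2 concrete, the snake cost can be discretized on a closed polygon. The sketch below is illustrative only: it assumes the standard form of the energy (a negative gradient-magnitude potential plus tension and rigidity terms), uses constant weights instead of w1(s) and w2(s), and evaluates a hypothetical synthetic edge map `toy_grad_mag_sq` rather than a real echocardiogram. All function names are made up for this example.

```python
import math

def snake_energy(points, grad_mag_sq, mu=1.0, w1=0.1, w2=0.01):
    """Discrete snake energy of a closed contour.

    points      : list of (x, y) vertices, treated as a closed polygon
    grad_mag_sq : callable (x, y) -> squared image gradient magnitude
    mu, w1, w2  : constant weights (an assumption; the continuous model
                  allows w1 and w2 to vary along the curve)
    """
    n = len(points)
    energy = 0.0
    for k in range(n):
        x_prev, y_prev = points[k - 1]          # index -1 wraps around for k = 0
        x, y = points[k]
        x_next, y_next = points[(k + 1) % n]
        # First derivative (tension term): forward difference.
        dx, dy = x_next - x, y_next - y
        # Second derivative (rigidity term): central difference.
        ddx = x_next - 2.0 * x + x_prev
        ddy = y_next - 2.0 * y + y_prev
        energy += (-mu * grad_mag_sq(x, y)
                   + w1 * (dx * dx + dy * dy)
                   + w2 * (ddx * ddx + ddy * ddy))
    return energy

def circle(radius, n=32):
    """Vertices of a circle of the given radius centred at the origin."""
    return [(radius * math.cos(2.0 * math.pi * k / n),
             radius * math.sin(2.0 * math.pi * k / n)) for k in range(n)]

def toy_grad_mag_sq(x, y):
    """Hypothetical edge map: a ring-shaped edge at radius 10."""
    r = math.hypot(x, y)
    return math.exp(-(r - 10.0) ** 2)

on_edge = snake_energy(circle(10.0), toy_grad_mag_sq)
off_edge = snake_energy(circle(14.0), toy_grad_mag_sq)
print(on_edge < off_edge)
```

In this toy setting the contour lying on the ring has the lower energy, which illustrates why the model relies on the target boundary coinciding with strong image edges.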
In the Mumford–Shah variational method (Mumford and Shah, 1989), the minimal partition problem is the one most often studied, where a curve C is sought to minimize the cost function

$$E(C) = \int_{\Omega_i} \big(I(x,y) - u_i\big)^2\,dx\,dy + \int_{\Omega_o} \big(I(x,y) - u_o\big)^2\,dx\,dy + \mu\,|C|,$$

where Ωi and Ωo denote the inside and outside regions, respectively, with respect to the curve C, ui and uo are piecewise constants for the two regions, |C| is the length of the curve, and μ weights the length term. Region homogeneity is implicitly assumed here. The curve C can be described explicitly or implicitly. The most widely used implicit representation is the so-called level set function (Chan and Vese, 2001).
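The two-region piecewise-constant cost can be evaluated directly on a pixel grid once a candidate partition is given as a binary mask. The following is a minimal sketch under stated assumptions: the region constants are taken to be the region means, the curve length is approximated by counting 4-neighbour pixel pairs with different labels, and `length_weight` is a hypothetical name for the weight on the length term. It illustrates the cost only and is not an optimizer.

```python
def partition_energy(image, inside, length_weight=0.5):
    """Two-region piecewise-constant energy of a binary partition.

    image  : list of rows of floats
    inside : same-shaped list of rows of bools (True = inside the curve)
    """
    in_vals = [v for row, mask in zip(image, inside)
               for v, flag in zip(row, mask) if flag]
    out_vals = [v for row, mask in zip(image, inside)
                for v, flag in zip(row, mask) if not flag]
    # Region constants: the mean minimizes the squared error in each region.
    u_i = sum(in_vals) / len(in_vals) if in_vals else 0.0
    u_o = sum(out_vals) / len(out_vals) if out_vals else 0.0
    fit = (sum((v - u_i) ** 2 for v in in_vals)
           + sum((v - u_o) ** 2 for v in out_vals))
    # Curve length: number of neighbouring pixel pairs that straddle the boundary.
    rows, cols = len(inside), len(inside[0])
    length = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and inside[r][c] != inside[r][c + 1]:
                length += 1
            if r + 1 < rows and inside[r][c] != inside[r + 1][c]:
                length += 1
    return fit + length_weight * length

def block_mask(size, top, left, side):
    """Square mask of the given side whose top-left corner is (top, left)."""
    return [[top <= r < top + side and left <= c < left + side
             for c in range(size)] for r in range(size)]

# Toy image: a homogeneous bright 3x3 block on a homogeneous dark background.
true_mask = block_mask(6, 1, 1, 3)
image = [[1.0 if flag else 0.0 for flag in row] for row in true_mask]
shifted_mask = block_mask(6, 2, 2, 3)

e_true = partition_energy(image, true_mask)
e_shifted = partition_energy(image, shifted_mask)
print(e_true, e_shifted)
```

When both regions are homogeneous, as in this toy image, the correct partition has the lowest cost. The cost becomes less informative once intensities inside a region vary strongly.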
The ASM (Cootes et al., 1995) characterizes the shape C using a linear generative model:

$$C = \bar{C} + P_C\,a,$$

where $\bar{C}$ is the mean shape, the columns of $P_C$ are the shape basis vectors, and a is the blending coefficient vector. The optimal shape C is obtained by iterating two steps: (i) searching around each point on C for a better position and (ii) updating the model parameter a to best describe these newly found positions. When searching for a better position for each point, criteria such as strong edges or a match to an expected statistical model are used.

The AAM (Cootes et al., 2001) jointly characterizes the appearance I and shape C using a linear generative model:

$$C = \bar{C} + P_C\,a, \qquad I = \bar{I} + P_I\,a,$$

where $\bar{I}$ is the mean appearance in a normalized (or so-called shape-free) patch, the columns of $P_I$ are the appearance basis vectors, and a is the blending coefficient vector shared by both the shape and appearance. The model parameters a, along with the similarity transformation parameters t applied to the shape and the scaling/offset variables u applied to the appearance, are found by fitting the AAM to the observed image using a mean square error criterion and an analysis-by-synthesis approach. In particular, the displacement δp in the parameters of interest $p = [a^T, t^T, u^T]^T$ is predicted via a linear form arising from the first-order Taylor expansion:

$$\delta p = R(p)\,\delta I(p),$$

where R(p) is the Jacobian matrix and δI(p) is the difference between the shape-free image patch extracted from the input image and the synthesized image parameterized by p. To some extent, this can also be viewed as performing a linear regression.
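Step (ii) of the ASM iteration, updating the coefficients so that the model best describes the newly found point positions, reduces to a projection when the shape basis is orthonormal (as it is when obtained by principal component analysis). The sketch below assumes such a basis and uses a hypothetical four-point square as the mean shape; the names `synthesize` and `fit_coefficients` are illustrative and do not come from the cited papers.

```python
import math

def synthesize(mean_shape, basis, a):
    """Linear shape model: mean shape plus a weighted sum of basis modes."""
    return [m + sum(a_k * mode[j] for a_k, mode in zip(a, basis))
            for j, m in enumerate(mean_shape)]

def fit_coefficients(mean_shape, basis, observed):
    """Least-squares coefficients for an orthonormal basis (a projection)."""
    residual = [o - m for o, m in zip(observed, mean_shape)]
    return [sum(b_j * r_j for b_j, r_j in zip(mode, residual))
            for mode in basis]

# Shapes are flattened as [x1, y1, x2, y2, ...]; the mean shape is a unit square.
mean_shape = [0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0]
s = 1.0 / math.sqrt(2.0)
basis = [
    [0.0, 0.0, s, 0.0, s, 0.0, 0.0, 0.0],   # mode 1: horizontal stretch
    [0.0, 0.0, 0.0, 0.0, 0.0, s, 0.0, s],   # mode 2: vertical stretch
]

# Simulate step (i): points found in the image, including one outlier
# displacement that the two modes cannot represent.
observed = synthesize(mean_shape, basis, [0.5, -0.2])
observed[0] += 0.1

# Step (ii): update the coefficients and regularize the shape.
fitted = fit_coefficients(mean_shape, basis, observed)
regularized = synthesize(mean_shape, basis, fitted)
print(fitted, regularized[0])
```

The projection recovers the coefficients used to generate the points and discards the displacement that lies outside the model subspace, which is how the linear shape model constrains the search result.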
However, directly applying the above three approaches without modification to the problem of LV endocardium segmentation from an A4C echocardiogram does not yield satisfactory performance. Fig. 1b–d presents several A4C examples that manifest the following challenges:
- The LV endocardium is not defined by the edge, e.g., it cuts through the papillary muscle attached to the endocardial wall. This makes the active contour/snake approach (Kass et al., 1988) ineffective.
- Region homogeneity is severely violated due to ultrasound imaging artifacts and signal dropouts, which poses difficulty for the Mumford–Shah variational method (Mumford and Shah, 1989).
- The variations in the shape and appearance of the LV endocardium cannot be sufficiently characterized by linear models, owing to differences in instrument, patient, and sonographer, respiratory interference, unnecessary probe movements, etc. We find empirically that the linear shape model is quite effective for the LV endocardial border in practice, but the ASM (Cootes et al., 1995) lacks a good matching criterion for finding a better position for each point. The linear appearance model in the AAM (Cootes et al., 2001), though useful on many occasions, is suboptimal; nonlinear modeling is more appropriate.
Furthermore, the above methods need good initialization, which requires solving a detection problem; otherwise, they are prone to being trapped in local minima. All these challenges have motivated alternative solutions to the problem of LV endocardium segmentation. In Section 2.1, we review various approaches proposed in the literature to segment the LV endocardium from 2D echocardiograms (Klingler et al., 1988; Binder et al., 1999; Mishra et al., 2003; Mignotte and Meunier, 2001; Yan and Zhuang, 2003; Lin et al., 2003; Chen et al., 2003; Bosch et al., 2002; Mignotte et al., 2001; Boukerroui et al., 2003; Georgescu et al., 2005). Most of them are based on image processing methods (Klingler et al., 1988) such as morphological operations and pixel classification (Binder et al., 1999), or on variants of the active contour (Mishra et al., 2003; Mignotte and Meunier, 2001), variational methods/level sets (Yan and Zhuang, 2003; Lin et al., 2003; Chen et al., 2003), extensions of the AAM (Bosch et al., 2002), and deformable templates (Mignotte et al., 2001).
In this paper, we present a novel machine learning approach called shape regression machine (SRM) (Zhou and Comaniciu, 2007; Zhou and Zhou, 2007). It enables efficient segmentation of an anatomical structure by leveraging the underlying anatomical image context manifested in medical images. The anatomical image context is defined as the interrelations among shape, appearance, and anatomy and is best exemplified by an annotated database of sufficient size. Because linear models have difficulty in modeling the complicated anatomical image context, we derive nonlinear models based on machine learning for discriminative characterization. In particular, we use regression to extract knowledge from the annotated database. After the regression functions are learned, the database no longer needs to be kept in memory. The SRM approach can deal with deformable contours that are not necessarily supported by edges or region homogeneity. In addition, it is fully automatic, requiring no manual interaction, and runs in less than a second. Section 3 describes the general principle of the SRM approach and its two components, namely regression-based object detection (Section 3.1) and regression-based deformable shape inference (Section 3.2).
Section 4 elaborates on an image-based boosting ridge regression (IBRR) method that underpins the SRM. The IBRR is an enhanced version of the image-based boosting regression proposed earlier in Zhou et al. (2005). Image-based regression has recently become an emerging challenge in the vision literature. Section 2.2 presents a review of this literature.
In Section 5, we present the experimental results of segmenting the LV endocardium from the A4C echocardiogram to demonstrate the effectiveness and efficiency of the SRM approach when compared with various state-of-the-art approaches. Section 6 presents two discussion topics and Section 7 concludes the paper.
LV endocardium segmentation from 2D B-mode echocardiogram
In this section, we give a brief review of algorithms specialized in segmenting the LV endocardial wall from 2D B-mode echocardiograms. Many of the approaches reviewed below are based on image processing methods such as morphological operations and pixel classification, or on variants of the active contour, variational methods/level sets, and deformable templates. Also, only a few deal with echocardiograms of the apical view (or long-axis view), with the majority focusing on the parasternal short-axis view,
Shape regression machine
In this paper, the shape C is represented by two parts: rigid and deformable. The rigid transformation is parameterized by θ and the deformable part by S. If a similarity transformation is used as θ, then the above shape representation reduces to Kendall’s interpretation (Kendall et al., 1999). To rigidly align the LV shape in the A4C echocardiogram more accurately, we use a 5D parameterization of θ, with (tx, ty) for translation, α for orientation, and (sx, sy) for scale (or
Image-based boosting ridge regression (IBRR)
The underpinning of the above two SRM stages is a regression procedure that takes an image patch as input and outputs a multidimensional variable. In this section, we invoke the influential boosting framework (Freund and Schapire, 1997; Friedman et al., 2000) to derive a novel regression algorithm called image-based boosting ridge regression (IBRR). In the remainder of this section, we assume that the input is $x \in \mathbb{R}^d$, the output is $y(x) \in \mathbb{R}^q$, and there are N training data points available. Note
Experimental results
We applied the SRM approach to segmenting the LV endocardium from 2D echocardiograms. Below, we first describe in Section 5.1 the details of data collection and annotation. Then we present the experimental results of the regression-based detection approach in Sections 5.2 (2D translation detection) and 5.3 (rigid object detection). Finally, we demonstrate in Section 5.4 the effectiveness and efficiency of IBRR when applied to deformable shape inference, in comparison with other regression methods.
Classification-based vs. regression-based object detection
A successful object detection approach based on machine learning must harness the learning complexity in its off-line learning and the computational complexity in its online inference from a test image.
- Learning complexity. In the classification-based approach, the main challenge lies in handling the number of negatives – anything other than positive is negative, apart from the large image appearance variations in positives and negatives. In theory, one image contributes one positive (assuming
Conclusion
We have presented a machine learning approach called shape regression machine (SRM) for fast detection and segmentation of anatomical structure. The SRM effectively utilizes the anatomical context in medical images to eliminate unfavorable restrictions posed by conventional deformable shape segmentation methods. In particular, the detection solution in the SRM replaces the exhaustive scanning of the query image required by the classification-based detector by sparse scanning and reaches
References (49)
- et al. Artificial neural networks and spatial temporal contour linking for automated endocardial contour detection on echocardiograms: A novel approach to determine left ventricular contractile function. Ultrasound Med. Biol. (1999)
- et al. Segmentation of ultrasound images: multiresolution 2-D and 3-D algorithm based on global and local statistics. Pattern Recognit. Lett. (2003)
- et al. Active shape models-their training and application. Comput. Vision Image Understand. (1995)
- et al. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. (1997)
- et al. Combinative multi-scale level set framework for echocardiographic image segmentation. Med. Image Anal. (2003)
- et al. A probabilistic atlas of the human brain: theory and rationale for its development. Neuroimage (1995)
- et al. A multiscale optimization approach for the dynamic contour-based boundary detection issue. Comput. Med. Imag. Graph. (2001)
- et al. A GA based approach for boundary detection of left ventricle with echocardiographic image sequences. Image Vision Comput. (2003)
- et al. Image registration via level-set motion: applications to atlas-based segmentation. Med. Image Anal. (2003)
- et al. Applying improved fast marching method to endocardial boundary detection in echocardiographic images. Pattern Recognit. Lett. (2003)
- Automatic atlas-based volume estimation of human brain regions from MR images. J. Comput. Assist. Tomo.
- Automatic segmentation of echocardiographic sequences by active appearance motion models. IEEE Trans. Med. Imag.
- Detection and measurement of fetal anatomies from ultrasound images using a constrained probabilistic boosting tree. IEEE Trans. Med. Imag.
- Active contours without edges. IEEE Trans. Image Process.
- Active appearance models. IEEE Trans. Pattern Anal. Machine Intell.
- Regression, prediction, and shrinkage. J. R. Stat. Soc. B
- Boosting methods for regression. Machine Learn.
- Greedy function approximation: a gradient boosting machine. Ann. Stat.