A comparison of three total variation based texture extraction models

https://doi.org/10.1016/j.jvcir.2007.01.004

Abstract

This paper qualitatively compares three recently proposed models for signal/image texture extraction based on total variation minimization: the Meyer [27], Vese–Osher (VO) [35], and TV-L1 [12], [38], [2], [3], [4], [29], [30], [31] models. We formulate discrete versions of these models as second-order cone programs (SOCPs) which can be solved efficiently by interior-point methods. Our experiments with these models on 1D oscillating signals and 2D images reveal their differences: the Meyer model tends to extract oscillation patterns in the input, the TV-L1 model performs a strict multiscale decomposition, and the Vese–Osher model has properties falling in between the other two models.

Introduction

Let f be an observed image that contains texture and/or noise. Texture is characterized as repeated and meaningful structure of small patterns. Noise is characterized as uncorrelated random patterns. The rest of an image, which is called cartoon, contains object hues and sharp edges (boundaries). Thus an image f can be decomposed as f = u + v, where u represents image cartoon and v is texture and/or noise. A general way to obtain this decomposition using the variational approach is to solve the problem min{TV(u) : ∥u − f∥_B ≤ σ}, where TV(u) denotes the total variation of u and ∥·∥_B is a norm (or semi-norm). The total variation of u, which is defined below in Section 1.1, is minimized to regularize u while keeping edges like object boundaries of f in u (i.e., while allowing discontinuities in u). The fidelity constraint ∥u − f∥_B ≤ σ forces u to be close to f.

The Banach space BV of functions of bounded variation is important in image processing because such functions are allowed to have discontinuities and hence keep edges. This can be seen from its definition as follows.

Let u ∈ L¹, and define [39]
$$\mathrm{TV}(u) := \sup\left\{\int u\,\mathrm{div}(g)\,dx \;:\; g \in C_0^1(\mathbb{R}^n;\mathbb{R}^n),\ |g(x)| \le 1 \text{ for all } x \in \mathbb{R}^n\right\}$$
as the total variation of u, where |·| denotes the Euclidean norm. Also, u ∈ BV if $\|u\|_{BV} := \|u\|_{L^1} + \mathrm{TV}(u) < \infty$. In the above definition, $g \in C_0^1(\mathbb{R}^n;\mathbb{R}^n)$, the set of compactly supported, continuously differentiable vector-valued functions, serves as a test set for u. If u is in the Sobolev space $W^{1,1}$ or $H^1$, it follows from integration by parts that TV(u) is equal to ∫|∇u|, where ∇u is the weak derivative of u. However, the use of test functions to define TV(u) allows u to have discontinuities. Therefore, BV is larger than $W^{1,1}$ and $H^1$. Equipped with the $\|\cdot\|_{BV}$ norm, BV is a Banach space.
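For intuition about how this definition behaves on a pixel grid, the following minimal sketch computes a discrete total variation by forward differences. The function name, the boundary convention, and the isotropic norm are choices of this illustration, not the finite-difference scheme of Section 2.

```python
import numpy as np

def total_variation(u):
    """Discrete isotropic TV of a 2D array via forward differences.

    Approximates TV(u) = sup { integral of u * div(g) : |g| <= 1 } = integral of |grad u|
    on a pixel grid; the last row/column uses a zero one-sided difference.
    """
    dx = np.diff(u, axis=0, append=u[-1:, :])   # u[i+1, j] - u[i, j]
    dy = np.diff(u, axis=1, append=u[:, -1:])   # u[i, j+1] - u[i, j]
    return np.sqrt(dx**2 + dy**2).sum()

# A sharp step has finite TV even though it is discontinuous:
u = np.zeros((64, 64))
u[:, 32:] = 1.0
print(total_variation(u))   # 64.0: jump height (1) times edge length (64 pixels)
```

This is the discrete counterpart of the statement that a discontinuous function can still lie in BV: a jump contributes its height times the length of the edge along which it occurs.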

BV(Ω), with Ω being a bounded open domain, is defined analogously to BV, with $L^1$ and $C_0^1(\mathbb{R}^n;\mathbb{R}^n)$ replaced by $L^1(\Omega)$ and $C_0^1(\Omega;\mathbb{R}^n)$, respectively.

Next, we define the space G [27]. Let G denote the Banach space consisting of all generalized functions v(x) defined on $\mathbb{R}^n$ which can be written as
$$v = \mathrm{div}(g), \qquad g = [g_i]_{i=1,\dots,n} \in L^\infty(\mathbb{R}^n;\mathbb{R}^n). \tag{1}$$

Its norm ∥v∥_G is defined as the infimum of the $L^\infty$ norm of |g(x)| over all decompositions (1) of v; in short, $\|v\|_G = \inf\{\,\| |g(x)| \|_{L^\infty} : v = \mathrm{div}(g)\,\}$. G is the dual of the closed subspace $\overline{BV}$ of BV, where $\overline{BV} := \{u \in BV : \nabla u \in L^1\}$ [27]. We note that finite difference approximations to functions in BV and $\overline{BV}$ are the same. For the definition and properties of G(Ω), see [5].

An immediate result of the above definitions is that
$$\int u\,v = \int u\,\mathrm{div}(g) = -\int \nabla u \cdot g \;\le\; \mathrm{TV}(u)\,\|v\|_G \tag{2}$$
holds for any u ∈ BV with compact support and v ∈ G. We say (u, v) is an extremal pair if (2) holds with equality.

In image processing, the space BV and the total variation semi-norm were first used by Rudin, Osher, and Fatemi [33] to remove noise from images. Specifically, their model obtains a cleaner image u ∈ BV from a noisy image f by letting u be the minimizer of $\mathrm{TV}(u) + \lambda\|u-f\|_{L^2}^2$, in which the regularizing term TV(u) tends to reduce the oscillations in u and the data fidelity term $\|u-f\|_{L^2}^2$ tends to keep u close to f.
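As a concrete reference point for the comparisons below, here is a minimal sketch of the discrete ROF problem using the CVXPY modeling package. This is an illustration only: the paper formulates the models as SOCPs and solves them with Mosek, and the finite-difference stencil, helper names, and parameter value here are arbitrary choices.

```python
import cvxpy as cp
import numpy as np

def tv(u):
    # Isotropic discrete TV via forward differences on the interior grid.
    dx = u[1:, :-1] - u[:-1, :-1]
    dy = u[:-1, 1:] - u[:-1, :-1]
    return cp.sum(cp.norm(cp.vstack([cp.vec(dx), cp.vec(dy)]), 2, axis=0))

def rof(f, lam):
    """Rudin-Osher-Fatemi: minimize TV(u) + lam * ||u - f||_2^2."""
    u = cp.Variable(f.shape)
    cp.Problem(cp.Minimize(tv(u) + lam * cp.sum_squares(u - f))).solve()
    return u.value

# Denoise a noisy step image: the cartoon u keeps the edge, the residual f - u is noise.
rng = np.random.default_rng(0)
f = np.zeros((32, 32)); f[:, 16:] = 1.0
f += 0.1 * rng.standard_normal(f.shape)
u = rof(f, lam=10.0)
```

Replacing the squared L² fidelity term in this sketch with other norms of f − u yields the three models compared in this paper.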

The ROF model is the precursor to a large number of image processing models having a similar form. Among the recent total variation-based cartoon-texture decomposition models, Meyer [27] and Haddad and Meyer [20] proposed using the G-norm defined above, Vese and Osher [35] approximated the G-norm by the div(L^p)-norm, Osher, Sole and Vese [32] proposed using the $H^{-1}$-norm, Lieu and Vese [26] proposed using the more general $H^s$-norm, and Le and Vese [24] and Garnett, Le, Meyer and Vese [18] proposed using the homogeneous Besov space $\dot B^s_{p,q}$, −2 < s < 0, 1 ≤ p, q ≤ ∞, extending Meyer's $\dot B^{-1}_{\infty,\infty}$, to model the oscillation component of an image. In addition, Chan and Esedoglu [12] and Yin, Goldfarb and Osher [38] used the $L^1$-norm together with total variation, following the earlier work by Alliney [2], [3], [4] and Nikolova [29], [30], [31].

In this subsection we present three cartoon-texture decomposition models that are based on the minimization of total variation. We refer readers interested in the theoretical analysis of these models to the works cited below and in the introduction. Although the analysis of existence and uniqueness of solutions and of duality/conjugacy is not within the scope of our discussion, in Section 3 we relate the differences among the image results from these models to the distinct properties of the three fidelity terms: $\|f-u\|_{L^1}$, $\|f-u\|_G$, and the Vese–Osher approximation of the latter.

In the rest of the paper, we assume the input image f has compact support contained in a bounded convex open set Ω. In our tests, Ω is an open square.

In [2], [3], [4], [12], [30], [31], [37], the square of the $L^2$ norm of f − u in the fidelity term of the original ROF model ($\min\{\mathrm{TV}(u) + \lambda\|f-u\|_{L^2}^2\}$) is replaced by the $L^1$ norm of f − u, which yields the following problems:
$$\text{Constraint model:}\quad \min_{u \in BV} \int_\Omega |\nabla u| \quad \text{s.t.}\ \int |f-u| \le \sigma, \tag{3}$$
$$\text{Lagrangian model:}\quad \min_{u \in BV} \int_\Omega |\nabla u| + \lambda \int |f-u|. \tag{4}$$

The above constrained minimization problem (3) is equivalent to its Lagrangian relaxed form (4), where λ is the Lagrange multiplier of the constraint ∫|f − u| ≤ σ. The two problems have the same solution if λ is chosen equal to the optimal value of the dual variable corresponding to the constraint in the constrained problem. Given σ or λ, we can calculate the other value by solving the corresponding problem. The same result also holds for Meyer's model below.

We chose to solve the Lagrangian relaxed version (4), rather than the constraint version (3), in our numerical experiments because several researchers [12], [37] have established the relationship between λ and the scale of f − u. For example, for the disk indicator signal $f = 1_{B(0,r)}$, centered at the origin with radius r, f − u = 1_{B(0,r)} (i.e., u = 0) for 0 < λ < 2/r, while f − u vanishes for λ > 2/r. Although this model appears to be simpler than Meyer's model and the Vese–Osher model below, it has recently been shown to have very interesting properties such as morphological invariance and texture extraction by scale [12], [37]. These properties are important in various applications in biomedical engineering and computer vision such as background correction [36], face recognition [14], [15], and brain MR image registration [13]. In Section 3, we demonstrate the ability of the TV-L1 model to separate out features of a certain scale in an image.
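The scale-selection behavior described above is easy to reproduce numerically. The sketch below solves the Lagrangian TV-L1 model (4) on a disk indicator with CVXPY; this is again an illustration, not the paper's SOCP/Mosek setup, the grid size and λ values are arbitrary, and on a discrete grid the threshold is only approximately 2/r.

```python
import cvxpy as cp
import numpy as np

def tvl1(f, lam):
    """Lagrangian TV-L1 model (4): minimize TV(u) + lam * ||f - u||_1."""
    u = cp.Variable(f.shape)
    objective = cp.tv(u) + lam * cp.sum(cp.abs(f - u))   # cp.tv is CVXPY's isotropic TV atom
    cp.Problem(cp.Minimize(objective)).solve()
    return u.value

# Indicator of a disk of radius r: removed (u ~ 0) for lam < 2/r, kept (u ~ f) for lam > 2/r.
n, r = 64, 10
y, x = np.mgrid[:n, :n]
f = ((x - n / 2) ** 2 + (y - n / 2) ** 2 <= r ** 2).astype(float)
for lam in (0.5 * 2 / r, 2.0 * 2 / r):
    u = tvl1(f, lam)
    print(lam, np.abs(u).max())   # approximately 0 below the threshold, approximately 1 above it
```

Solving the constraint version (3) instead, with ∫|f − u| ≤ σ as an explicit constraint, and reading off that constraint's dual value after the solve gives the corresponding λ, which is the equivalence noted earlier.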

In addition to the SOCP approach that we use in the paper to solve (4) numerically, the graph-based approaches [17], [11] were recently shown to be very efficient in solving an approximate version of (4).

To extract cartoon u in the space BV and texture and/or noise v as an oscillating function, Meyer [27] proposed the following model:
$$\text{Constraint version:}\quad \inf_{u \in BV} \int |\nabla u| \quad \text{s.t.}\ \|f-u\|_G \le \sigma, \tag{5}$$
$$\text{Lagrangian version:}\quad \inf_{u \in BV} \int |\nabla u| + \lambda\|f-u\|_G. \tag{6}$$

As we have pointed out in Section 1.1, G is the dual space of $\overline{BV}$, a subspace of BV, so G is closely connected to BV. Meyer gave a few examples in [27], including the one shown at the end of the next paragraph, illustrating the appropriateness of modeling oscillating patterns by functions in G.

Unfortunately, it is not possible to write down Euler–Lagrange equations for the Lagrangian form of Meyer's model (6), and hence to use a straightforward partial differential equation method to solve it. Alternatively, several models [5], [6], [32], [35] have been proposed to solve (6) approximately. The Vese–Osher model [35] described in the next subsection approximates $\| |g(x)| \|_{L^\infty}$ by $\| |g(x)| \|_{L^p}$, with 1 ≤ p < ∞. The Osher–Sole–Vese model [32] replaces ∥v∥_G by the Hilbert functional $\|v\|_{H^{-1}}^2$. The more recent A2BC model [5], [6], [7] is inspired by Chambolle's projection algorithm [10] and minimizes $\mathrm{TV}(u) + \lambda\|f-u-v\|_{L^2}^2$ over (u, v) ∈ BV × {v ∈ G : ∥v∥_G ≤ μ}. Similar projection algorithms proposed in [8], [9] are also used to approximately solve (4) and (6). Recently, Kindermann and Osher [21] showed that (6) is equivalent to a minimax problem and proposed a numerical method to solve this saddle-point problem. Other numerical approaches based on the dual representation of the G-norm are introduced in [16] by Chung, Le, Lieu, Tanushev, and Vese, in [25] by Lieu, and in [23] by Le, Lieu, and Vese. In [34], Starck, Elad, and Donoho use sparse basis pursuit to achieve a similar decomposition. In Section 2, we present SOCP-based optimization models to solve both (5) and (6) exactly (i.e., without any approximation or regularization applied to the non-smooth terms ∫|∇u| and ∥v∥_G, except for the use of finite differences).

In contrast to our choice for the TV-L1 model, we chose to solve (5) with specified σ's in our numerical experiments because setting an upper bound on ∥f − u∥_G is more meaningful than penalizing ∥f − u∥_G. The following example demonstrates that ∥v∥_G decreases as the oscillation of v becomes stronger: let v(x) = cos(tx), which oscillates more rapidly for larger t; one can show ∥v∥_G = 1/t because $\cos(tx) = \frac{d}{dx}\left(\frac{1}{t}\sin(tx)\right)$ and $\left\|\frac{1}{t}\sin(tx)\right\|_{L^\infty} = 1/t$. Therefore, to separate from f a signal with oscillations stronger than a specified level, it is more straightforward to solve the constrained problem (5).

To calculate the G-norm of a function f alone, one can choose to solve an SOCP or use the dual method by Kindermann, Osher and Xu [22]. The authors of the latter work exploit (2) to develop a level-set based iterative method.
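As a sanity check of the cos(tx) example above, the discrete G-norm of a 1D signal can be computed by a small conic program: find a field g with prescribed divergence (here a simple forward difference) and minimal supremum norm. The sketch below uses CVXPY; the boundary convention and step size are choices of this illustration, and this is only the 1D analogue, not the 2D SOCP of Section 2.

```python
import cvxpy as cp
import numpy as np

def gnorm_1d(v, h):
    """Discrete 1D G-norm: minimize ||g||_inf subject to (g[i+1] - g[i]) / h = v[i]."""
    g = cp.Variable(v.size + 1)
    constraints = [(g[1:] - g[:-1]) / h == v]
    prob = cp.Problem(cp.Minimize(cp.norm(g, 'inf')), constraints)
    prob.solve()
    return prob.value

# ||cos(t x)||_G = 1/t: stronger oscillation (larger t) means smaller G-norm.
h = 0.01
x = np.arange(0.0, 2 * np.pi, h)
for t in (1, 2, 4, 8):
    print(t, gnorm_1d(np.cos(t * x), h))   # approximately 1/t
```

The same idea, with a two-component field g and per-pixel Euclidean norms, gives the 2D discrete G-norm as an SOCP.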

Motivated by the definition of the $L^\infty$ norm of |g(x)| as the limit
$$\| |g| \|_{L^\infty} = \lim_{p \to \infty} \| |g| \|_{L^p},$$

Vese and Osher [35] proposed the following approximation to Meyer's model (5):
$$\inf_{u \in BV,\ g \in C_0^1(\mathbb{R}^n;\mathbb{R}^n)} \mathrm{VO}_p(u,g) := \int |\nabla u| + \lambda \int |f - u - \mathrm{div}(g)|^2 + \mu\left[\int |g|^p\right]^{1/p}, \tag{8}$$
where p ≥ 1.

In $\mathbb{R}^2$, minimizing $\mathrm{VO}_p$ with respect to u and g = (g₁, g₂) yields the associated Euler–Lagrange equations:
$$u = f - \partial_1 g_1 - \partial_2 g_2 + \frac{1}{2\lambda}\,\mathrm{div}\!\left(\frac{\nabla u}{|\nabla u|}\right),$$
$$\mu\left\|\sqrt{g_1^2+g_2^2}\right\|_{L^p}^{1-p}\left(\sqrt{g_1^2+g_2^2}\right)^{p-2} g_1 = 2\lambda\left[\partial_1(u-f) + \partial_{11}^2 g_1 + \partial_{12}^2 g_2\right],$$
$$\mu\left\|\sqrt{g_1^2+g_2^2}\right\|_{L^p}^{1-p}\left(\sqrt{g_1^2+g_2^2}\right)^{p-2} g_2 = 2\lambda\left[\partial_2(u-f) + \partial_{12}^2 g_1 + \partial_{22}^2 g_2\right].$$

In [35], the authors solve the above system of partial differential equations for different values of p, with 1 ≤ p ≤ 10, via a sequential descent approach and claim that the different values of p give very similar numerical results.

The VO model (8) can be viewed as a relaxation of Meyer's model (5), since the requirement f − u = div(g) is relaxed by penalizing its violation and $\sup_g\{\,\| |g| \|_{L^\infty} : \| |g| \|_{L^p} \le \sigma\,\} = \infty$. This point is clearly illustrated by the numerical comparisons between these two models presented in Section 3.
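For completeness, here is a sketch of the VO functional (8) with p = 1 in the same CVXPY style as the earlier sketches. It is an illustration only: the paper solves a residual-free SOCP reformulation, and the divergence stencil, boundary handling, and parameter values below are arbitrary choices.

```python
import cvxpy as cp
import numpy as np

def div(g1, g2):
    # Backward-difference divergence (boundary handling simplified for this sketch).
    d1 = cp.vstack([g1[:1, :], g1[1:, :] - g1[:-1, :]])
    d2 = cp.hstack([g2[:, :1], g2[:, 1:] - g2[:, :-1]])
    return d1 + d2

def vese_osher_p1(f, lam, mu):
    """VO model (8) with p = 1: TV(u) + lam*||f - u - div(g)||_2^2 + mu*sum(|g|)."""
    u = cp.Variable(f.shape)
    g1 = cp.Variable(f.shape)
    g2 = cp.Variable(f.shape)
    g_mag = cp.norm(cp.vstack([cp.vec(g1), cp.vec(g2)]), 2, axis=0)   # |g| at each pixel
    objective = cp.tv(u) + lam * cp.sum_squares(f - u - div(g1, g2)) + mu * cp.sum(g_mag)
    cp.Problem(cp.Minimize(objective)).solve()
    return u.value   # cartoon; the texture is carried by div(g) plus a small residual

# Example: a small textured image separates into a smooth u and an oscillatory remainder.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[:32, :32]
f = 0.5 + 0.3 * np.sin(xx) * (xx > 16) + 0.05 * rng.standard_normal((32, 32))
u = vese_osher_p1(f, lam=1.0, mu=0.1)
```

Because the L^p penalty on |g| does not bound $\| |g| \|_{L^\infty}$, this is indeed a relaxation of (5), as noted above.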

The purpose of this paper is to accurately compute and compare the three TV-based models presented above using a uniform approach. To do this, we formulate and solve all three of the above models as second-order cone programs (SOCPs). These formulations do not require regularization to handle the non-smoothness of these models. In this subsection, we give a short introduction to SOCPs and the use of interior-point methods to solve them.

In an SOCP the vector of variables $x \in \mathbb{R}^n$ is composed of subvectors $x_i \in \mathbb{R}^{n_i}$, i.e., x ≡ (x₁; x₂; …; x_r), where n = n₁ + n₂ + ⋯ + n_r, and each subvector x_i must lie either in an elementary second-order cone of dimension n_i,
$$K^{n_i} := \left\{ x_i = (x_i^0; \bar{x}_i) \in \mathbb{R} \times \mathbb{R}^{n_i-1} \;:\; |\bar{x}_i| \le x_i^0 \right\},$$
or in an $n_i$-dimensional rotated second-order cone,
$$Q^{n_i} := \left\{ x_i = (\bar{x}_1; \dots; \bar{x}_{n_i}) \in \mathbb{R}^{n_i} \;:\; 2\bar{x}_1\bar{x}_2 \ge \sum_{j=3}^{n_i} \bar{x}_j^2,\ \bar{x}_1, \bar{x}_2 \ge 0 \right\}.$$

Note that $Q^{n_i}$ is an elementary second-order cone under a linear transformation; i.e.,
$$\left( \tfrac{1}{\sqrt{2}}(x_1 + x_2);\ \tfrac{1}{\sqrt{2}}(x_1 - x_2);\ x_3;\ \dots;\ x_{n_i} \right) \in K^{n_i} \iff (x_1; x_2; x_3; \dots; x_{n_i}) \in Q^{n_i}.$$

With these definitions, an SOCP can be written in the following form [1]:
$$\begin{aligned} \min\quad & c_1^\top x_1 + \cdots + c_r^\top x_r \\ \text{s.t.}\quad & A_1 x_1 + \cdots + A_r x_r = b, \\ & x_i \in K^{n_i} \text{ or } Q^{n_i}, \quad i = 1,\dots,r, \end{aligned} \tag{13}$$
where $c_i \in \mathbb{R}^{n_i}$ and $A_i \in \mathbb{R}^{m \times n_i}$ for i = 1, …, r, and $b \in \mathbb{R}^m$.

Since a one-dimensional second-order cone corresponds to a semi-infinite ray, SOCPs can accommodate nonnegative variables. In fact if all cones are one-dimensional, then the above SOCP is just a standard form linear program. As is the case for linear programs, SOCPs can be solved in polynomial time by interior point methods. This is the approach that we take to solve the TV-based cartoon-texture decomposition models in this paper.
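To make the connection to the models above concrete, the following sketch writes the TV-L1 model (4) in epigraph form: a linear objective over per-pixel second-order cone constraints (for the gradient magnitudes) and one-dimensional cone constraints (for the absolute residuals). This conveys the spirit of the SOCP reformulation; the paper's Section 2 writes the cones out explicitly for Mosek, whereas CVXPY performs the conversion automatically, and the stencil and variable names below are choices of this sketch.

```python
import cvxpy as cp
import numpy as np

def tvl1_as_socp(f, lam):
    """Epigraph form of the TV-L1 model (4): linear objective, cone constraints."""
    n = f.shape[0]
    u = cp.Variable((n, n))
    t = cp.Variable((n - 1, n - 1))   # t_ij >= |(grad u)_ij|: elementary cones of dimension 3
    s = cp.Variable((n, n))           # s_ij >= |f_ij - u_ij|: one-dimensional cones (rays)
    dx = u[1:, :-1] - u[:-1, :-1]
    dy = u[:-1, 1:] - u[:-1, :-1]
    constraints = [
        cp.norm(cp.vstack([cp.vec(dx), cp.vec(dy)]), 2, axis=0) <= cp.vec(t),
        cp.abs(f - u) <= s,
    ]
    prob = cp.Problem(cp.Minimize(cp.sum(t) + lam * cp.sum(s)), constraints)
    prob.solve()                      # handed to a conic (interior-point) solver
    return u.value

# Same decomposition as the direct formulation, but now visibly in the form (13).
```

The Meyer and VO models can be handled similarly; for instance, a G-norm bound ∥f − u∥_G ≤ σ contributes one additional cone per pixel bounding |g(x)| by σ, together with the linear constraint f − u = div(g).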

Over the past two decades there has been extensive research on and development of interior-point methods for solving linear programs. In the last few years this research and development has been extended to SOCPs. Consequently, these problems can now be solved efficiently both in practice and in theory (in polynomial time). Moreover, interior-point SOCP methods often yield highly accurate solutions. The optimality conditions for the SOCP (13) are
$$\begin{aligned} & A_1 x_1 + \cdots + A_r x_r = b, \\ & A_i^\top y + s_i = c_i, && i = 1,\dots,r, \\ & x_i^\top s_i = 0, && i = 1,\dots,r, \\ & x_i^0 \bar{s}_i + s_i^0 \bar{x}_i = 0, && i = 1,\dots,r. \end{aligned} \tag{14}$$

Interior-point methods for SOCPs approximately solve a sequence of perturbed optimality conditions (the "0" on the right-hand side of the third block in (14), which is equal to the duality gap, is replaced by a positive scalar μ) by taking single damped Newton steps (Δx₁, …, Δx_r, Δy, Δs₁, …, Δs_r) while making sure that the new iterates remain interior (i.e., x_i + Δx_i and s_i + Δs_i are strictly inside their respective cones). The iteration stops once a new iterate satisfies certain prescribed stopping conditions, such as the duality gap μ falling below a tolerance. The typical number of iterations is between 15 and 50 and is usually fairly independent of the problem size. Moreover, in [19] it is shown that each interior-point iteration takes O(n³) time and O(n² log n) bytes for solving an SOCP formulation of the Rudin–Osher–Fatemi model [33].


Preliminaries

In practice, grey-scale images are represented as two-dimensional matrices whose elements give the grey-scale values of the corresponding pixels. In this paper we restrict our discussion to square domains in $\mathbb{R}^2$, and hence to n × n real matrices (denoted by $M^{n \times n}$), for the sake of simplicity.

Let $f \in M^{n \times n}$ be an observed image, and let u denote the cartoon and v the texture and/or noise in f, so that (f, u, v) satisfies
$$f_{i,j} = u_{i,j} + v_{i,j}, \qquad i,j = 1,\dots,n.$$

$f_{i,j}$, $u_{i,j}$, and $v_{i,j}$ are, respectively, the grey-scale values of f, u, and v at pixel (i, j).

Numerical results

In this section, we present numerical results for the three cartoon-texture decomposition models and compare them. In all cases we solved the Lagrangian version (4) of the TV-L1 model, the constraint version (5) of Meyer’s model, and the residual-free version (45) of the Vese–Osher (VO) model (8) with p = 1.

We used the commercial optimization package Mosek [28] as our SOCP solver. Mosek is designed to solve a variety of large-scale optimization problems, including SOCPs. Before solving a

Conclusions

In this work, we have computationally studied three total variation based models with discrete inputs: the Meyer, VO, and TV-L1 models. We showed how to formulate these models as second-order cone programs, which we solved by an interior-point method. We tested these models using a variety of 1D signals and 2D images to reveal their differences in decomposing inputs into their cartoon and oscillating/small-scale/texture parts. The Meyer model tends to capture the pattern of the oscillations in the input.

References (39)

  • L. Rudin et al., Nonlinear total variation based noise removal algorithms, Physica D (1992)
  • F. Alizadeh et al., Second-order cone programming, Mathematical Programming (2003)
  • S. Alliney, Digital filters as absolute norm regularizers, IEEE Transactions on Signal Processing (1992)
  • S. Alliney, Recursive median filters of increasing order: a variational approach, IEEE Transactions on Signal Processing (1996)
  • S. Alliney, A property of the minimum vectors of a regularizing functional defined by means of the absolute norm, IEEE Transactions on Signal Processing (1997)
  • G. Aubert et al., Modeling very oscillating signals, application to image processing, Applied Mathematics and Optimization (2005)
  • J.F. Aujol, G. Aubert, L. Blanc-Feraud, A. Chambolle, Decomposing an image: application to textured images and SAR...
  • J.F. Aujol et al., Image decomposition into a bounded variation component and an oscillating component, Journal of Mathematical Imaging and Vision (2005)
  • J.F. Aujol et al., Dual norms and image decomposition models, International Journal of Computer Vision (2005)
  • J.F. Aujol et al., Structure-texture image decomposition - modeling, algorithms, and parameter selection, International Journal of Computer Vision (2006)
  • A. Chambolle, An algorithm for total variation minimization and applications, Journal of Mathematical Imaging and Vision (2004)
  • A. Chambolle, Total variation minimization and a class of binary MRF models, in: Energy Minimization Methods in...
  • T.F. Chan et al., Aspects of total variation regularized L1 function approximation, SIAM Journal on Applied Mathematics (2005)
  • T. Chen, T. Huang, W. Yin, X.S. Zhou, A new coarse-to-fine framework for 3D brain MR image registration, in: Computer...
  • T. Chen, W. Yin, X.S. Zhou, D. Domaniciu, T. Huang, Illumination normalization for face recognition and uneven...
  • T. Chen et al., Total variation models for variable lighting face recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) (2006)
  • G. Chung, T. Le, L. Lieu, N. Tanushev, L. Vese, Computational methods for image restoration, image segmentation, and...
  • J. Darbon, M. Sigelle, Image restoration with discrete constrained total variation, part I: fast and exact...
  • J. Garnett, T. Le, Y. Meyer, L. Vese, Image decompositions using bounded variation and homogeneous Besov spaces, Tech....

    Research supported by NSF Grants DMS-01-04282, DNS-03-12222 and ACI-03-21917, ONR Grants N00014-03-1-0514 and N00014-03-0071, and DOE Grant GE-FG01-92ER-25126.
