Abstract

3-D full waveform inversion (FWI) of seismic wavefields is routinely implemented with explicit time-stepping simulators. A clear advantage of explicit time stepping is that it avoids solving the large-scale implicit linear systems that arise with frequency-domain formulations. However, FWI using explicit time stepping may require a very fine time step and, as a consequence, significant computational resources and run times. If the computational challenges of wavefield simulation can be effectively handled, an FWI scheme implemented in the frequency domain using only a few frequencies offers a cost-effective alternative to FWI in the time domain. We have therefore implemented a 3-D FWI scheme for elastic wave propagation in the Fourier domain. To overcome the computational bottleneck in wavefield simulation, we have exploited an efficient Krylov iterative solver for the elastic wave equations approximated with second- and fourth-order finite differences. The solver does not exploit multilevel preconditioning for wavefield simulation, but is coupled efficiently to the inversion iteration workflow to reduce computational cost. The workflow is best described as a series of sequential inversion experiments in which, for seismic reflection acquisition geometries, the data are laddered such that we first image highly damped data, followed by data for which damping is systematically reduced. The key to our modelling approach is its ability to take advantage of solver efficiency when the elastic wavefields are damped. As the inversion experiment progresses, damping is significantly reduced, effectively simulating non-damped wavefields in the Fourier domain. While the cost of the forward simulation increases as damping is reduced, this is counterbalanced by the cost of the outer inversion iteration, which is reduced because of a better starting model obtained from the more strongly damped wavefield used in the previous inversion experiment.
For cross-well data, it is also possible to launch a successful inversion experiment without laddering the damping constants. With this type of acquisition geometry, the solver is still quite effective using a small fixed damping constant. To avoid cycle skipping, we also employ a multiscale imaging approach, in which the frequency content of the data is also laddered (with the data now including both reflection and cross-well acquisition geometries). Thus the inversion process is launched using low-frequency data to first recover the long spatial wavelengths of the image. With this image as a new starting model, adding higher frequency data refines and enhances the resolution of the image. FWI using laddered frequencies with an efficient damping scheme enables reconstruction of elastic attributes of the subsurface at a resolution that approaches half the smallest wavelength utilized to image the subsurface. We show that such reconstructions can be carried out effectively using two to six frequencies, depending upon the application. The proposed FWI scheme requires massively parallel computing resources for reasonable execution times.

1 INTRODUCTION

In the last decade, advances in multiprocessor computer systems and algorithms have motivated the development of 3-D seismic full waveform inversion (FWI) techniques that can recover subsurface physical parameters (e.g. velocities, density or other seismic attributes) based on data fitting and iterative optimization. This technology was first suggested in the 1980s for the time domain (Tarantola 1984, 1986). Because FWI in the time domain exploits explicit time stepping, it requires the very fine time steps necessary for accurately simulating broadband seismic traces, which leads to significant consumption of computational resources and long simulation times; time windowing the seismic trace may help relax this requirement on the time step, but may also reduce the information content. The use of multiple source and receiver sets in industrial-sized seismic surveys, along with the difficulty of estimating an unknown source wavelet, further compounds the problem (Shipp & Singh 2002; Sheen et al. 2006).

The first discussion of FWI in the frequency domain was initiated by Pratt (1990). The appeal of frequency-domain FWI is its computational efficiency under applicable conditions, achieved by limiting the inversion to a few discrete frequencies (Sirgue & Pratt 2004), where the unknown source wavelet can be estimated simultaneously during the inversion process (Pratt 1999; Shin & Min 2006). These efficiencies are clearly evident in a 2-D frequency-domain FWI formulation, because factorization and solution of the linear system required for wavefield simulation are computationally practical at each frequency with multiple right-hand sides (Marfurt 1984). This system is solved efficiently for multiple sources using a direct solver, which performs one LU factorization of the matrix, followed by a forward and backward substitution per source. However, this approach is difficult to design for 3-D wave equations (acoustic and elastic), because the frequency-domain direct solver requires extensive memory for realistically sized problems (Operto et al. 2007). While Wang et al. (2012a) have reported progress for the 3-D Helmholtz equation applied to acoustic wavefields, the problem is significantly worse in the case of elastic wavefield simulation, because it is now nearly an order of magnitude larger in size. Nevertheless, Wang et al. (2012b) have reported some encouraging results for simulation of elastic wavefields using a direct solver approach.

While FWI can provide high-resolution quantitative images of seismic attributes in either domain, it comes with very high computational costs, particularly in 3-D. Therefore, the most successful and cost-efficient applications of FWI are 2-D (Shipp & Singh 2002; Operto et al. 2004; Brenders & Pratt 2007; Shin et al. 2010). The 3-D case has been considered by Sirgue et al. (2008), Ben-Hadj-Ali et al. (2008), Warner et al. (2008), Pyun et al. (2008), Plessix (2009), Guasch et al. (2012) and Wang et al. (2012). Among the existing publications on 3-D FWI, the majority employ an acoustic approximation (assuming there is no shear wave). Published studies based on 3-D elastic wave inverse modelling (Pyun et al. 2008; Guasch et al. 2012) are much fewer, owing to the much higher computational cost already mentioned.

It should be clear that an effective 3-D FWI in the frequency domain requires a robust solver. Even with the advances made with direct solvers, the appeal of iterative solvers is strong, since they require far fewer computational resources and less memory. While existing iterative solvers are far from optimal, new approaches for algorithms (Sonneveld & van Gijzen 2008) and pre-conditioners (Erlangga & Nabben 2008) have been developed, so it is now possible to efficiently solve linear systems of sizes relevant to 3-D FWI (Plessix 2009). With a robust iterative solver, the advantages of 3-D frequency-domain FWI become apparent, notably that only a few frequencies are required in an inversion experiment (Warner et al. 2008; Plessix 2009). In this paper, we use the iterative solver proposed by Sonneveld & van Gijzen (2008) and realized for the solution of damped elastic wavefields in the Laplace–Fourier domain, using a scheme proposed by Petrov & Newman (2012). We will show that the time to solution for this problem is short enough that we can use the solver within a 3-D FWI framework. While the solver does not exploit a multilevel pre-conditioner for wavefield simulation, we have coupled it very efficiently to the inversion iteration workflow to reduce its overall computational cost.

A major difficulty of FWI is related to the ill-posed nature and non-linearity of the inverse problem, which is generally formulated as a least-squares local optimization problem. Particularly problematic is the non-linearity exhibited by the presence of local and multiple minima in the least-squares error functional (Virieux & Operto 2009). The ill-posedness of FWI mainly arises from the lack of low frequencies in the source bandwidth and the incomplete illumination of the subsurface provided by conventional seismic surveys. Consequently, the results strongly depend on the starting model. The presence of noise in the data can also have adverse effects on image quality, particularly when such data are greatly contaminated with noise in the lower frequency band, such as in marine data acquisition below frequencies of several hertz. If such noise is not an issue or can be effectively suppressed, several hierarchical multiscale strategies that proceed from low frequencies to higher frequencies can potentially mitigate the non-linearity of the inverse problem (Pratt 1990; Bunks et al. 1995; Sirgue & Pratt 2004; Brossier et al. 2009). Most 3-D FWI methods (time or frequency domain) are based on gradient approaches, such as the non-linear conjugate gradient (NLCG) method (Pratt 1999; Sirgue & Pratt 2004; Ben-Hadj-Ali et al. 2008). Higher order minimization schemes such as Gauss–Newton and full Newton methods (see Pratt et al. 1998; Hu et al. 2009) have appeal, but can be prohibitively expensive in 3-D.

There are three main approaches to waveform inversion in the frequency domain: (1) the multifrequency simultaneous inversion (Shin & Min 2006); (2) the sequential single-frequency inversion (Sirgue & Pratt 2004; Ben-Hadj-Ali et al. 2008) and (3) the combination of the two approaches (Pratt 1999). In the multifrequency simultaneous inversion, all selected frequencies are inverted jointly, as in the time domain. However, this approach is very expensive, because it is necessary to process not only a large number of shots but also a large number of frequencies simultaneously. Moreover, to balance the contributions from the different frequency components in the data, careful data-weighting schemes must be applied (Hu et al. 2009). The sequential single-frequency inversion requires fewer computational resources than the multifrequency simultaneous inversion. Also, the issue of frequency data weighting does not arise. As a rule, low-frequency data are less non-linear with respect to the model than high-frequency data. Thus the inversion can be successfully performed proceeding from low to high frequencies (Sirgue & Pratt 2004; Ben-Hadj-Ali et al. 2008; Shin & Cha 2008, 2009).

Finally, the modeller can also combine the multifrequency simultaneous inversion approach with the sequentially ordered single-frequency inversion approach, where frequencies are binned into groups. All frequencies in each group are inverted simultaneously, and all groups are inverted sequentially, commencing from low to high frequencies (Pratt 1999; Brenders & Pratt 2007). From our experience, especially with regard to computational resources, we consider a sequentially ordered single-frequency inversion as most efficient.

To better mitigate the problem of local minima and to recover the long-wavelength components of the velocity model, Brown et al. (2005) and Shin & Cha (2008) proposed the use of Laplace-domain waveform inversion. Later, Shin & Cha (2009) suggested waveform inversion in the Laplace–Fourier domain. Waveform inversion in the Laplace and Laplace–Fourier domains inverts Laplace-transformed wavefields, which are zero- and low-frequency components of damped wavefields, using several damping constants. Application of the multiscale approach now includes not only sequentially increasing frequencies, but also a range of damping factors. The motivation for extending the FWI imaging process into the Laplace domain is that the objective functional is smoother and exhibits fewer local minima than in conventional frequency-domain waveform inversion (Brown et al. 2005).

In this paper, we propose a sequentially ordered single-frequency 3-D elastic waveform inversion scheme using the quadratic error functional in the Laplace–Fourier domain. Our algorithm sequentially inverts single complex-frequency data, where the complex frequency consists of the ordinary frequency and the Laplace damping constant, and is designed to find an elastic attribute model from a simple initial starting model. First, we review the theory of Laplace–Fourier-domain elastic wavefield modelling and inversion. Following this, we describe our multiscale strategy to mitigate the non-linearity of the 3-D elastic inverse problem. This strategy involves two loops, over frequencies and over damping constants, in the inversion algorithm. Next, we describe a massively parallel implementation of the FWI algorithm for imaging 3-D elastic media. Finally, we apply the FWI algorithm to three synthetic examples of increasing complexity, with the aim of validating the algorithm for reconstructions of elastic attributes in complex media. Because of the complexity and high computational cost of 3-D elastic FWI, we apply our scheme to modest-sized problems to investigate its performance and define an effective workflow. Results for large-scale models and data sets will be reported in subsequent studies. Through synthetic applications to several models, including cross-well and seismic reflection acquisition geometries, we demonstrate that our strategy of sequential inversion of damped wavefields can improve the quality of the inversion results and reduce the total number of inversion iterations, by exploiting laddered frequencies and damping constants. We show that it is possible to reconstruct elastic attributes of the subsurface at a resolution that approaches half the smallest wavelength utilized to image the subsurface.

2 ELASTIC WAVEFORM INVERSION THEORY

2.1 Regularized least squares

For our purposes, we define the elastic FWI problem as determining an elastic attribute model (compressional and shear velocities, or bulk or shear moduli and mass density), which minimizes an objective functional expressed by the residuals between the model responses and the observed data. Following the inverse problem formulation in the frequency domain (Pratt et al.1998) and its extension in the Laplace–Fourier domain for acoustic problems (Sirgue & Pratt 2004; Shin & Cha 2009), we solve the inverse problem by minimizing the error functional and model smoothness constraint in the L2 norm:
\begin{equation} \phi ({\bf m}) = \sum\limits_{s_k } {\sum\limits_q {\frac{1}{2}\left[ {{\bf d}_q^{{\rm obs}} \left( {s_k } \right) - {\bf d}_q^{{\rm sim}} ({\bf m},s_k )} \right]^H {\bf E}^H {\bf E}\left[ {{\bf d}_q^{{\rm obs}} \left( {s_k } \right) - {\bf d}_q^{{\rm sim}} ({\bf m},s_k )} \right]} } + \frac{1}{2}\lambda {\bf m}^T {\bf W}^T {\bf Wm}. \end{equation}
(1)
In eq. (1), |${\bf d}_q^{{\rm obs}} \left( {s_k } \right)$| and |${\bf d}_q^{{\rm sim}} ({\bf m},s_k )$| are the observed and predicted data vectors, with subscript q indicating the source position; W is the regularization matrix; E is the diagonal matrix of weights defined by the data error; m is the vector of model parameters; λ is the regularization parameter that balances data error against model smoothness; and the superscripts H and T denote the Hermitian conjugate and transpose operations, respectively. The observed data vectors consist of the Laplace–Fourier image of elastic displacement velocities, obtained from the measured time-domain seismic wavefield data:
\begin{equation} {\bf d}_q^{{\rm obs}} \left( {s_k } \right) = \int\limits_0^\infty {{\bf d}_q^{{\rm obs}} \left( t \right){\rm e}^{ - s_k t} {\rm d}t} \end{equation}
(2)
with complex frequency |$s_k = \sigma _k + i\omega _k$|, where |$\sigma _k$| is the Laplace damping constant and |$\omega _k$| is the angular frequency, with |$i = \sqrt { - 1} $|.
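As an illustration of eq. (2), the following minimal sketch approximates the Laplace–Fourier transform of a single sampled trace by the trapezoidal rule. The function name `laplace_fourier` and the test trace are assumptions made for this example only; the sketch also assumes the trace starts at t = 0 and is long enough for the damped integrand to have decayed at the last sample.

```python
import numpy as np

def laplace_fourier(trace, dt, sigma, omega):
    """Approximate eq. (2) for one trace sampled at t = 0, dt, 2*dt, ..."""
    t = np.arange(len(trace)) * dt
    s = sigma + 1j * omega                 # complex frequency s_k
    integrand = trace * np.exp(-s * t)
    # Trapezoidal rule; the upper limit of the integral is truncated at the
    # end of the trace, where the integrand is assumed to be negligible.
    return dt * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))

# Check against a trace with a known transform: exp(-a t) -> 1 / (s + a).
dt, a = 1.0e-3, 3.0
t = np.arange(0.0, 20.0, dt)
sigma, omega = 2.0, 2.0 * np.pi * 1.5
approx = laplace_fourier(np.exp(-a * t), dt, sigma, omega)
exact = 1.0 / (sigma + 1j * omega + a)
```

In this form, a larger damping constant weights the early part of the trace more heavily, which is why strongly damped data mainly carry information about early arrivals.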
Predicted data |${\bf d}_q^{{\rm sim}} \left( {{\bf m},s_k } \right) = \left[ {d_{q1}^{{\rm sim}} \left( {{\bf m},s_k } \right),d_{q2}^{{\rm sim}} \left( {{\bf m},s_k } \right),...} \right]^{\rm T}$| are defined by the displacement velocity field and depend upon the model parameters m:
\begin{equation} {\bf d}_q^{sim} \left( {{\bf m},s_k } \right) = {\bf \hat G}_q {\bf v}_q \left( {{\bf m},s_k } \right), \end{equation}
(3)
where |${\bf \hat G}_q $| is an interpolation operator applied to the calculated velocity field in the vicinity of the detector for a source with index q. If we consider the elastic medium parameters density ρ (or buoyancy b = 1/ρ) and bulk and shear moduli κ and μ, so that |${\bf m} = \left( {{\bf b},{\boldsymbol \kappa },{\boldsymbol \mu }} \right)$|, the velocity components specified in eq. (3), |${\bf v}_q = \left( {v_x ,v_y ,v_z } \right)$|, satisfy the system of elastic equations for each complex frequency |$s_k$|, which may be obtained directly by taking the Laplace–Fourier transform of the time-domain system (Virieux 1986):
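[Equation not reproduced: Laplace–Fourier-domain system of elastic equations for the velocity and stress components]
(4)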
Here |$\tau _{\alpha \beta }$|, with α, β ∈ (x, y, z), are the stress tensor components, |$\left( {f_x^k ,f_y^k ,f_z^k } \right)$| are the body forces per unit volume, and the symbols |$\partial _x ,\partial _y ,\partial _z$| denote the partial differential operators |$\frac{\partial }{{\partial x}},\frac{\partial }{{\partial y}},\frac{\partial }{{\partial z}}$|, respectively.
Non-reflecting conditions for the wavefield components are applied at the boundaries of the region Ω where the medium parameters m are to be imaged. Additionally, the free-surface boundary condition may be applied at the surface z = 0 for the stress components:
\begin{equation} \tau _{\alpha z} \left( {x,y,z = 0} \right) = 0;\;\alpha \in \left( {x,y,z} \right).\end{equation}
(5)
When the system of eq. (4) is approximated with finite differences (Petrov & Newman 2012) using a staggered grid (Virieux 1986; Graves 1996), a linear system results:
\begin{equation} {\bf K}_q {\bf v}_q = {\bf f}_q ,\;\;{\bf K}_q = {\bf I} - \left\langle {\bf b} \right\rangle {\bf D}_\tau \cdot \left( {\left\langle {{\bf k\mu }} \right\rangle \circ {\bf D}_v } \right), \end{equation}
(6)
where |${\bf v}_q = \left( {{\bf v}_x ,{\bf v}_y ,{\bf v}_z } \right)^T$| is the velocity vector, |${\bf f}_q^k = \left( {{\bf f}_x^k ,{\bf f}_y^k ,{\bf f}_z^k } \right)^T$| is the source vector in the Laplace–Fourier domain, |$\left\langle {{\bf k\mu }} \right\rangle$| and |$\left\langle {\bf b} \right\rangle$| are block matrices of the averaged elastic parameters, and |${\bf D}_\tau$| and |${\bf D}_v$| are block matrices of the finite-difference (FD) operators. Their explicit expressions are presented in the Appendix. The symbol ∘ denotes the Hadamard (entry-wise) product of matrices.

The parameters that control model smoothness are the regularization matrix, W, which consists of a FD approximation to the gradient operator (∇m), and the regularization parameter λ, which is used to control the amount of smoothness to be incorporated into the inverse model. Large values of λ will produce very smooth models, at the expense of poorer fits to the observed data. Small parameters give superior data fits, but the resulting models can be non-physical. The typical modelling strategy is to run the inversion using several values of λ and attempt to find an acceptable match to data within observational errors. Of those models that match the data, the model with the smoothest set of characteristics is selected. This model will correspond to the largest acceptable regularization parameter.
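The structure of eq. (1) can be made concrete with a small sketch. The code below evaluates the data misfit and the smoothness term for given residuals; the function names, the single shared weight vector and the 1-D first-difference matrix standing in for W are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def objective(d_obs, d_sim, weights, W, m, lam):
    """Evaluate eq. (1) for lists of data vectors (one per source/frequency)."""
    phi_d = 0.0
    for obs, sim in zip(d_obs, d_sim):
        r = weights * (obs - sim)           # E is diagonal, so E r is a scaling
        phi_d += 0.5 * np.vdot(r, r).real   # (1/2) r^H E^H E r
    Wm = W @ m
    phi_m = 0.5 * (Wm @ Wm)                 # (1/2) m^T W^T W m
    return phi_d + lam * phi_m

def first_difference(n):
    """1-D forward-difference matrix, a stand-in for the FD gradient in W."""
    return np.eye(n - 1, n, k=1) - np.eye(n - 1, n)

# Tiny example: one source, two complex data values, three model cells.
d_obs = [np.array([1.0 + 1.0j, 2.0 + 0.0j])]
d_sim = [np.array([1.0 + 0.0j, 2.0 + 0.0j])]
weights = np.array([2.0, 1.0])
m = np.array([1.0, 2.0, 4.0])
phi = objective(d_obs, d_sim, weights, first_difference(3), m, lam=0.1)
```

Repeating such an evaluation for several values of `lam` mirrors the strategy described above: among the models that fit the data within the observational errors, the one obtained with the largest `lam` is the smoothest.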

We minimize the objective functional in eq. (1) using the NLCG method (Fletcher & Reeves 1964; Polyak & Ribière 1969) following the implementations of Newman & Alumbaugh (1997, 2000) and Commer & Newman (2008). The NLCG method requires calculations of a gradient, search direction, and a step length value along the given search direction. The flowchart of our algorithm is described in detail by Newman & Alumbaugh (2000).
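The three ingredients named above (gradient, search direction and step length) can be seen in the following generic NLCG sketch. It uses the Polak–Ribière update with a non-negativity safeguard and a simple backtracking line search; the line search, the restart rule and all names are assumptions chosen for brevity and are not taken from the cited implementations.

```python
import numpy as np

def nlcg(phi, grad, m0, n_iter=200, tol=1e-7):
    """Minimize phi(m) with a Polak-Ribiere non-linear conjugate gradient."""
    m = np.asarray(m0, dtype=float).copy()
    g = grad(m)
    u = -g                                      # first search direction
    for _ in range(n_iter):
        if np.linalg.norm(g) < tol:
            break
        # Restart with steepest descent if u is not a clear descent direction.
        if g @ u > -1e-3 * np.linalg.norm(g) * np.linalg.norm(u):
            u = -g
        # Backtracking (Armijo) search for the step length alpha along u.
        alpha, phi0, slope = 1.0, phi(m), g @ u
        while phi(m + alpha * u) > phi0 + 1e-4 * alpha * slope and alpha > 1e-16:
            alpha *= 0.5
        m = m + alpha * u
        g_new = grad(m)
        beta = max(0.0, g_new @ (g_new - g) / (g @ g))   # Polak-Ribiere (PR+)
        u = -g_new + beta * u
        g = g_new
    return m

# Usage on a small convex quadratic, whose minimizer solves A m = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
m_min = nlcg(lambda x: 0.5 * x @ A @ x - b @ x, lambda x: A @ x - b, np.zeros(2))
```

In FWI each call to `phi` or `grad` costs at least one forward (and, for the gradient, one adjoint) simulation, so the number of line-search trials matters far more than it does in this toy problem.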

2.2 Computation of the gradients

The gradient of the objective function (1) is formally written as
\begin{equation} \nabla \phi = \nabla \phi _d + \lambda \nabla \phi _m , \end{equation}
(7)
where |$\phi _d$| and |$\phi _m$| are functionals that describe the data misfit and the model smoothness constraint, respectively. Evaluation of |$\nabla \phi _m$| directly leads to
\begin{equation} \nabla \phi _m = {\bf W}^T {\bf Wm}, \end{equation}
(8)
and for the data misfit part, we obtain
\begin{equation} \nabla \phi _d = - \sum\limits_{s_k } {\sum\limits_q {{\mathop{\rm Re}\nolimits} \left\{ {\left[ {{\bf d}_q^{{\rm obs}} \left( {s_k } \right) - {\bf d}_q^{{\rm sim}} ({\bf m},s_k )} \right]^H {\bf E}^H {\bf E}\nabla \left[ {{\bf d}_q^{{\rm sim}} ({\bf m},s_k )} \right]} \right\}} } ,\quad \nabla \left[ {{\bf d}_q^{{\rm sim}} \left( {{\bf m},s_k } \right)} \right] = {\bf G}_q^T \nabla \left[ {{\bf v}_q \left( {{\bf m},s_k } \right)} \right], \end{equation}
(9)
where |${\bf G}_q^T $| is an interpolation matrix to the measurement points, which is the FD approximation of operator |${\bf \hat G}_q $|⁠.
The gradient of the velocities |$\nabla \left[ {{\bf v}_q \left( {{\bf m},s_k } \right)} \right]$| may be obtained directly using the system described by eq. (6). For the model parameters in the pth cell and the qth source, we have
\begin{equation} \begin{array}{lll} \displaystyle\frac{{\partial {\bf v}_q }}{{\partial b_p }} = {\bf K}_q^{ - 1} \displaystyle\frac{{\partial \left\langle {\bf b} \right\rangle }}{{\partial b_p }}\left\langle {\bf b} \right\rangle ^{ - 1} {\bf v}_q , \\ \displaystyle\frac{{\partial {\bf v}_q }}{{\partial \mu _p }} = {\bf K}_q^{ - 1} \left\langle {\bf b} \right\rangle {\bf D}_\tau \left( {\displaystyle\frac{{\partial \left\langle {{\bf k\mu }} \right\rangle }}{{\partial \mu _p }} \circ {\bf D}_v {\bf v}_q } \right) \\ \displaystyle\frac{{\partial {\bf v}_q }}{{\partial \kappa _p }} = {\bf K}_q^{ - 1} \left\langle {\bf b} \right\rangle {\bf D}_\tau \left( {\displaystyle\frac{{\partial \left\langle {{\bf k\mu }} \right\rangle }}{{\partial \kappa _p }} \circ {\bf D}_v {\bf v}_q } \right), \\ \end{array} \end{equation}
(10)
where |${\bf v}_q$| is the solution of the forward problem described by eq. (6) for the qth source, which for simplicity may be written as |${\bf v}_q = {\bf K}_q^{ - 1} {\bf f}_q$|, and the block matrices |$\frac{{\partial \left\langle {\bf b} \right\rangle }}{{\partial b_p }},\frac{{\partial \left\langle {{\bf k\mu }} \right\rangle }}{{\partial \mu _p }},\frac{{\partial \left\langle {{\bf k\mu }} \right\rangle }}{{\partial \kappa _p }}$| are given by
[Equation not reproduced: explicit form of the block matrices of parameter derivatives]
(11)
Thus the gradient components ∇ϕd may be rewritten as
\begin{equation} \begin{array}{l} \displaystyle\frac{{\partial \phi _d }}{{\partial b_p }} = - \sum\limits_{s_k } {\sum\limits_q {{\mathop{\rm Re}\nolimits} \left\{ {{\bf y}_q^H \left( {{\bf E}^H {\bf E}} \right){\bf \xi }_{q,p} } \right\}} } ,{\bf \xi }_{q,p} = \displaystyle\frac{{\partial \left\langle {\bf b} \right\rangle }}{{\partial b_p }}\left\langle {\bf b} \right\rangle ^{ - 1} {\bf v}_q , \\ \displaystyle\frac{{\partial \phi _d }}{{\partial \mu _p }} = - \sum\limits_{s_k } {\sum\limits_q {{\mathop{\rm Re}\nolimits} \left\{ {{\bf y}_q^H \left( {{\bf E}^H {\bf E}} \right){\bf \eta }_{q,p} } \right\}} } ,{\bf \eta }_{q,p} = \left\langle {\bf b} \right\rangle {\bf D}_\tau \left( {\displaystyle\frac{{\partial \left\langle {{\bf k\mu }} \right\rangle }}{{\partial \mu _p }} \circ {\bf D}_v {\bf v}_q } \right), \\ \displaystyle\frac{{\partial \phi _d }}{{\partial \kappa _p }} = - \sum\limits_{s_k } {\sum\limits_q {{\mathop{\rm Re}\nolimits} \left\{ {{\bf y}_q^H \left( {{\bf E}^H {\bf E}} \right){\bf \zeta }_{q,p} } \right\}} } ,{\bf \zeta }_{q,p} = \left\langle {\bf b} \right\rangle {\bf D}_\tau \left( {\displaystyle\frac{{\partial \left\langle {{\bf k\mu }} \right\rangle }}{{\partial \kappa _p }} \circ {\bf D}_v {\bf v}_q } \right). \\ \end{array} \end{equation}
(12)
Note that |${\bf y}_q$| is the solution of the linear system with the Hermitian-transposed matrix |${\bf K}_q^H$|, whose right-hand side is the complex conjugate of the data misfit vector |${\bf g}_q = {\bf G}_q \left( {{\bf d}_q^{{\rm obs}} - {\bf d}_q^{{\rm sim}} } \right)$|:
\begin{equation} {\bf K}_q^H {\bf y}_q = {\bf \bar g}_q . \end{equation}
(13)
The vector |${\bf f}_{q,p}^v = \left( {{\bf \xi }_{q,p} ,{ \bf\eta }_{q,p} ,{\bf \zeta }_{q,p} } \right)$| is the virtual source with respect to the qth source.
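The adjoint construction above can be checked on a small dense analogue. In the sketch below, the matrix K(m) = I − diag(m)A is a toy stand-in for eq. (6) with a single parameter class, the source is taken to be independent of m, E is the identity, and the adjoint right-hand side is written as G^H r; these choices, the conjugation convention and all names are assumptions made so that the example is self-consistent, and they may differ in detail from eq. (13).

```python
import numpy as np

def forward(m, A, f):
    """Toy forward problem K(m) v = f with K = I - diag(m) A."""
    K = np.eye(len(m)) - np.diag(m) @ A
    return K, np.linalg.solve(K, f)

def misfit(m, A, f, G, d_obs):
    r = d_obs - G @ forward(m, A, f)[1]
    return 0.5 * np.vdot(r, r).real

def adjoint_gradient(m, A, f, G, d_obs):
    """Gradient of the misfit from one forward and one adjoint solve."""
    K, v = forward(m, A, f)
    r = d_obs - G @ v                                 # data residual
    y = np.linalg.solve(K.conj().T, G.conj().T @ r)   # adjoint system K^H y = G^H r
    virtual = A @ v      # p-th virtual source is non-zero only in row p
    return -np.real(np.conj(y) * virtual)

rng = np.random.default_rng(0)
n = 6
A = 0.1 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
f = rng.standard_normal(n) + 1j * rng.standard_normal(n)
G = np.eye(3, n)                                      # record the first three unknowns
d_obs = G @ forward(1.0 + 0.2 * rng.standard_normal(n), A, f)[1]
m0 = np.ones(n)
g_adj = adjoint_gradient(m0, A, f, G, d_obs)
```

The point of the construction is cost: all components of the gradient follow from two linear solves per source, rather than one solve per model parameter as a direct evaluation of eq. (10) would require.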
The most efficient preconditioning is to pre-multiply the gradient by an approximation to the inverse of the Hessian. The basic idea of the pre-conditioner is to alter the gradient such that it better approximates the Newton/Gauss–Newton direction, thereby significantly improving the convergence of the inversion iteration (Pratt et al. 1998). Because of computational costs, we used the diagonal of the pseudo-Hessian matrix suggested by Choi et al. (2008) as a pre-conditioner:
\begin{equation} diag\left( {\bf H} \right) = \sum\limits_{s_k } {\sum\limits_q {{\mathop{\rm Re}\nolimits} \left\{ {\left( {{\bf f}_{q,1}^v } \right)^H {\bf f}_{q,1}^v ,\left( {{\bf f}_{q,2}^v } \right)^H {\bf f}_{q,2}^v ,...,\left( {{\bf f}_{q,3N}^v } \right)^H {\bf f}_{q,3N}^v } \right\}} } . \end{equation}
(14)
The gradient vector may be altered in two ways during the preconditioning step. In the first case, at the ith iteration step, the preconditioning matrix |${\bf M}_{(i)}$| is defined as
\begin{equation} {\bf M}_{(i)} = diag({\bf H}_{\left( i \right)} ) + \lambda {\bf W}^T {\bf W}\end{equation}
(15)
and the gradient vector is scaled by
\begin{equation} \nabla \phi ^{new} = \left( {diag({\bf H}_{\left( i \right)} ) + \lambda {\bf W}^T {\bf W}} \right)^{ - 1} \nabla \phi . \end{equation}
(16)
In the second case, |${\bf M}_{(i)} = {\bf I}$| and only the data misfit part of the gradient, |$\nabla \phi _d$|, is scaled by the relation
\begin{equation} \nabla \phi _d^{{\rm new}} = \left[ {{\bf I}\gamma + diag({\bf H}_{\left( i \right)} )} \right]^{ - 1} \nabla \phi _d ,\;\;\nabla \phi ^{{\rm new}} = \nabla \phi _d^{{\rm new}} + \lambda \nabla \phi _m , \end{equation}
(17)
where γ is the damping factor, which we estimated as
\begin{equation} \gamma \approx 10^{ - 2} tr \left({\bf H}_{( i)} \right) / \left( 3N \right) ,\end{equation}
(18)
with N representing the number of computational cells and tr denoting the trace of the pseudo-Hessian matrix.
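A minimal sketch of the second preconditioning variant, eqs (14), (17) and (18), is given below. The array layout of the virtual sources, the function names and the numerical values are assumptions for illustration; the model term is weighted by the regularization parameter as in eq. (7).

```python
import numpy as np

def pseudo_hessian_diagonal(virtual_sources):
    """Eq. (14): sum |f^v|^2 over sources and grid points for each parameter.

    virtual_sources has the assumed shape (n_sources, n_params, n_grid),
    with n_params = 3N for the three elastic attributes.
    """
    return np.sum(np.abs(virtual_sources) ** 2, axis=(0, 2))

def precondition(grad_d, grad_m, diag_H, lam, scale=1e-2):
    """Scale the data part of the gradient, eqs (17) and (18)."""
    gamma = scale * diag_H.sum() / diag_H.size   # 1e-2 * tr(H) / (3N)
    return grad_d / (gamma + diag_H) + lam * grad_m

# Tiny example: 2 sources, 3 parameters, 4 grid points.
diag_H = pseudo_hessian_diagonal(np.ones((2, 3, 4), dtype=complex))
g_new = precondition(np.full(3, 8.08), np.array([1.0, 2.0, 3.0]), diag_H, lam=0.5)
```

The damping factor keeps the division well behaved in poorly illuminated cells, where the corresponding pseudo-Hessian entries are close to zero.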

2.3 Logarithmic parameters

The minimization problem defined by eq. (1) without additional constraints often leads to model parameters that are non-positive. To restrict the buoyancy and the shear and bulk moduli to positive quantities, one can invert for logarithmic parameters. Following Newman & Alumbaugh (1997) and Commer & Newman (2008), we reformulate the inverse problem for logarithmic parameters with lower bounds |$lb_p$| and upper bounds |$ub_p$|, which requires that the elements of m be redefined as
\begin{equation} m'_p = \log \left( {\frac{{m_p - lb_p }}{{ub_p - m_p }}} \right),\;\;lb_p < m_p < ub_p . \end{equation}
(19)
The gradient in eq. (12) is therefore modified. For a specific attribute |$m_p$| (buoyancy b, or bulk or shear modulus κ, μ), we have in the |$m^{\prime} _p $| parameter log space:
\begin{equation} \frac{{\partial \phi _d }}{{\partial m'_p }} = \frac{{\partial \phi _d }}{{\partial m_p }}\frac{{\partial m_p }}{{\partial m'_p }} = \frac{{\partial \phi _d }}{{\partial m_p }}\frac{{\left( {ub_p - m_p } \right)\left( {m_p - lb_p } \right)}}{{\left( {ub_p - lb_p } \right)}}. \end{equation}
(20)
Once |$m^{\prime} _p $| is updated in the NLCG iteration, the result in non-transformed space follows from the expression
\begin{equation} m_p = \frac{{ub_p + lb_p \exp \left( { - m'_p } \right)}}{{1 + \exp \left( { - m'_p } \right)}}. \end{equation}
(21)
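Eqs (19) to (21) translate directly into code. The short sketch below, with hypothetical function names and bounds, implements the forward and inverse transforms and the chain-rule factor applied to the gradient.

```python
import numpy as np

def to_log(m, lb, ub):
    """Eq. (19): map a bounded parameter lb < m < ub to the unbounded m'."""
    return np.log((m - lb) / (ub - m))

def from_log(m_log, lb, ub):
    """Eq. (21): map m' back to the bounded parameter."""
    e = np.exp(-m_log)
    return (ub + lb * e) / (1.0 + e)

def gradient_to_log(grad_m, m, lb, ub):
    """Eq. (20): chain rule, d(phi)/dm' = d(phi)/dm * dm/dm'."""
    return grad_m * (ub - m) * (m - lb) / (ub - lb)

# Round trip for a parameter bounded between 1 and 4.
lb, ub = 1.0, 4.0
m = np.array([1.5, 2.0, 3.9])
m_back = from_log(to_log(m, lb, ub), lb, ub)
```

Whatever value an unconstrained update assigns to m', the back-transformed parameter stays between the bounds, and the chain-rule factor vanishes as a parameter approaches either bound.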

2.4 Workflow of sequential waveform inversion in the Laplace–Fourier domain

We employ a sequential, single-frequency Laplace–Fourier-domain waveform inversion approach for imaging across multiple scales, as proposed by Tarantola (1986), Bunks et al. (1995), Pratt (1999), Sirgue & Pratt (2004) and Shin & Cha (2009). Because the frequency is now complex-valued, s = σ + iω, consisting of a Laplace damping factor σ and the angular frequency ω, waveform inversion will involve multiple real-valued frequencies and damping constants. The choices made in selecting these frequencies and damping constants, and their respective ordering, are critical for a successful outcome of the inversion experiment.

Although several sequential ordering schemes for waveform inversion in the Laplace–Fourier domain have been proposed (Shin et al. 2010), we favour the multiscale inversion approach, wherein frequencies are mainly changed from low to high, following the strategy suggested by Bunks et al. (1995) and Sirgue & Pratt (2004), and where the Laplace damping constant is changed from large to small values. Fig. 1(a) illustrates this waveform inversion scheme. The proposed ordering is based upon two reasons. The first reason is concerned with local minima issues that arise in FWI due to cycle skipping (Virieux & Operto 2009). To address this problem, modellers typically begin the inversion process with low frequencies and proceed sequentially to higher frequencies, employing a multiscale imaging approach (Fichtner 2011). If the inversion procedure is initiated at too high a frequency, in the absence of a very good starting model, the complexity of high-frequency wave scattering and re-radiation in heterogeneous media will produce unacceptable results. While some regions of the inversion domain may change little during the inversion iteration, other regions of the model that are illuminated will be extremely sensitive to the discontinuities of the medium. These discontinuities may be considered as secondary sources of elastic waves. So rather than rendering an accurate map of the elastic parameters, the reflectivity at interfaces is emphasized.

Figure 1. (a) The sequential downward waveform inversion in the Laplace–Fourier domain; (b) the truncated downward waveform inversion in the Laplace–Fourier domain with reverse loop.

As a rule, at lower frequencies, the functional in eq. (1) is smoother than at higher frequencies; hence, the probability of encountering a local minimum decreases with falling frequency. At low enough frequency, the wavelength is considerably larger than the size of typical heterogeneous structures, so the wavefields can effectively sense only smooth variations in the elastic attributes of the medium. Consequently, it is only possible to reconstruct a smooth map of the elastic attributes. Another way to make a smooth reconstruction from the wavefield, thus illuminating long-wavelength details in the image, is to increase the damping constant. This was demonstrated in the 3-D SEG/EAGE salt model example (Petrov & Newman 2012). However, the use of a large damping constant σ may lead to undesirable effects, because strong attenuation restricts the depth of penetration of the wavefield. This can be measured using specific ‘diffusion’ scale lengths
\begin{equation} \lambda _D^p = \frac{{\left\langle {V_p } \right\rangle }}{{\sqrt {f\sigma } }},\;\lambda _D^s = \frac{{\left\langle {V_s } \right\rangle }}{{\sqrt {f\sigma } }}, \end{equation}
(22)
where |$\left\langle {V_p } \right\rangle$| and |$\left\langle {V_s } \right\rangle$| are the P- and S-wave velocities averaged over the inversion domain.
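As a worked example of eq. (22), the sketch below evaluates the scale length for illustrative values. The interpretation of f as the frequency in hertz and of σ in s⁻¹, as well as the velocity values, are assumptions made for this example.

```python
import math

def diffusion_length(mean_velocity, frequency, sigma):
    """Eq. (22): scale length V / sqrt(f * sigma), in the length unit of V."""
    return mean_velocity / math.sqrt(frequency * sigma)

# Hypothetical averages: <Vp> = 3000 m/s, <Vs> = 1500 m/s, at f = 4 Hz.
lp_weak = diffusion_length(3000.0, 4.0, 1.0)     # 1500 m for sigma = 1
lp_strong = diffusion_length(3000.0, 4.0, 4.0)   # 750 m for sigma = 4
ls_strong = diffusion_length(1500.0, 4.0, 4.0)   # 375 m for sigma = 4
```

Because the scale length varies as the inverse square root of σ, quadrupling the damping constant halves the depth range that the damped wavefield can usefully sense.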

The other reason is concerned with the efficiency of the iterative solver used for the forward problem. The convergence of the numerical solution of the system in eq. (6) strongly depends on the damping parameter σ (Petrov & Newman 2012). Our simulations show that a twofold decrease in σ, for example for σ ≤ 2 s⁻¹, leads to a doubling of the number of iterations for the same level of solution accuracy. Thus, starting the inversion process with a large damping constant at a given frequency will result in a faster overall time to solution than starting with lower damping.

However, such an order of inversion (Fig. 1a) may not necessarily be optimal, because the use of large damping when proceeding from lower to higher frequencies will smooth specific structures that were obtained with a small damping constant during the previous inversion cycle. Thus, it seems more practical not to cover the full range of damping constants at lower frequencies, but rather to truncate them to ensure that image resolution will be similar to that obtained with the largest damping constant used at the next frequency. Fig. 1(b) presents the truncated cycle for the sequential waveform inversion. We will show that this strategy, when combined with multiscale imaging in frequency, can be used to great advantage to produce high-resolution 3-D images of elastic attributes in a cost-effective manner.

Additionally, we note that the final solution of the inverse problem must satisfy all frequencies employed in the imaging experiment, independent of the strategy employed. Thus, it can be very useful after inversion at high frequencies to return to the lower frequencies (Fig. 1b). This reverse loop helps preserve the content of lower frequency data in the imaging process as data with higher frequency content are added to the inversion process. Guided by the general consideration that the solution must satisfy all frequencies, the return to lower frequency should be made after completion of inversion at each new higher frequency, but we believe the reverse loop can still be efficiently employed after every other laddered frequency.
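
The ordering described above can be sketched as a simple schedule generator. This is a hypothetical illustration: the function name, the `revisit_every` parameter and the choice to revisit lower frequencies only at the smallest damping constant are assumptions, and the exact sequences of Figs 1(a) and (b) may differ.

```python
def build_schedule(freqs, sigmas, revisit_every=1):
    """Return a list of (frequency, sigma) inversion stages.

    Frequencies are visited from lowest to highest; at each frequency the
    damping constants are visited from largest to smallest. After every
    `revisit_every`-th new frequency (never the first), the lower frequencies
    are revisited at the smallest damping constant (the "reverse loop").
    """
    freqs = sorted(freqs)
    sigmas = sorted(sigmas, reverse=True)
    stages = []
    for i, f in enumerate(freqs):
        stages.extend((f, s) for s in sigmas)
        if i > 0 and i % revisit_every == 0:
            # Reverse loop (assumed form): one pass over the lower frequencies.
            stages.extend((f_low, sigmas[-1]) for f_low in freqs[:i])
    return stages

for stage in build_schedule([1.5, 3.0, 5.0], [1.0, 0.5, 0.25]):
    print(stage)
```

Setting `revisit_every=2` corresponds to running the reverse loop after every other laddered frequency, the cheaper option mentioned in the text.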

3 PARALLEL IMPLEMENTATION

The FWI in 3-D for each frequency and damping constant may require several hundred iterations to obtain an acceptable solution, which can be used in the next stage of the inversion workflow. Note further the requirement of at least three forward solutions per inversion iteration. This computationally demanding problem thus needs to be implemented on a distributed computing platform with sufficient resources, which include massively parallel computing cores with large amounts of available core memory. Our inversion code uses two levels of parallelism. The first is over the data volume (seismic sources): different source and receiver sets are assigned to different cores for the computation. Hence, the wavefields arising from multiple sources can be computed in parallel on different sets of cores, independently. The second is over the modelling domain (spatial parallelism, or domain decomposition), which distributes the model across banks of cores. All processor communication is carried out using the Message Passing Interface (MPI) software library.

3.1 Domain decomposition

The forward problem needed for FWI involves the solution of the linear system in eq. (6) for multiple sources. To solve this linear system in a distributed environment, we first split up the modelling domain into a Cartesian topology (Alumbaugh et al. 1996): a forward modelling problem is solved among nxyz = nx × ny × nz cores or processors. Petrov & Newman (2012) showed that the system can be solved very effectively by the Induced Dimension Reduction method (Sonneveld & van Gijzen 2008; Onoue et al. 2009), which is one of the iterative Krylov subspace methods. Each iteration involves matrix–vector products on each of the nxyz processors. To complete a product, values of the solution vector at the current Krylov iteration that are not stored on a processor must be passed to it by neighbouring processors. We split the Krylov iteration into three steps: (1) multiply v by the matrix |$\left\langle {{\bf \lambda \mu }} \right\rangle \circ {\bf D}_{\bf \tau } $|, (2) multiply the resulting vector by the matrix |$\left\langle {\bf b} \right\rangle {\bf D}_{\bf v} $| and finally (3) add the diagonal terms proportional to v,
\begin{equation} \begin{array}{lll} {{\bf z = }\left\langle {{\bf \lambda \mu }} \right\rangle \circ {\bf D}_{\bf \tau } {\bf v}^{n - 1} ;} \\ {{\bf v}^n = \left\langle {\bf b} \right\rangle {\bf D}_{\bf v} {\bf z};} \\ {{\bf v}^n = {\bf v}^{n - 1} - {\bf v}^n .} \\ \end{array}\end{equation}
(23)
Each processor requires two communications as part of the construction of the Krylov iteration with six neighbouring processors. In addition to the message passing between neighbouring processors, several global communications, which involve all processors, required for a specific forward-modelling simulation, are carried out to complete the dot products necessary in the Krylov iteration.
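
To make the communication pattern concrete, here is a minimal single-process sketch in one dimension. It is not the authors' MPI code: a plain second-difference stencil stands in for the FD operators of eq. (23), the sub-domains are ordinary Python lists, and the "message passing" is simply a copy of one boundary value from each neighbouring block.

```python
def stencil_global(v):
    """Apply the second-difference stencil to the whole vector (zero outside)."""
    n = len(v)
    padded = [0.0] + list(v) + [0.0]
    return [padded[i] - 2.0 * padded[i + 1] + padded[i + 2] for i in range(n)]

def stencil_decomposed(v, nblocks):
    """Same product, computed block by block with a one-cell halo exchange."""
    n = len(v)
    if not 1 <= nblocks <= n:
        raise ValueError("need between 1 and len(v) blocks")
    bounds = [n * b // nblocks for b in range(nblocks + 1)]
    blocks = [list(v[bounds[b]:bounds[b + 1]]) for b in range(nblocks)]
    result = []
    for b, local in enumerate(blocks):
        # Halo values come from the neighbouring blocks (zero at domain edges).
        left = blocks[b - 1][-1] if b > 0 else 0.0
        right = blocks[b + 1][0] if b < nblocks - 1 else 0.0
        padded = [left] + local + [right]
        result.extend(
            padded[i] - 2.0 * padded[i + 1] + padded[i + 2]
            for i in range(len(local))
        )
    return result

v = [float(i * i % 7) for i in range(12)]
print(stencil_decomposed(v, 3) == stencil_global(v))
```

In three dimensions each block has six face neighbours instead of two, which matches the six neighbouring processors mentioned above.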
To illustrate the performance of the parallel implementation, we consider a fixed-size problem (3-D SEG/EAGE model, Aminzadeh et al. 1997) with a total of 588 × 588 × 261 grid cells, which was run on a CRAY XT4. Fig. 2 shows speedup as a function of the number of processors used. Here, the fixed speedup is defined as the ratio between the elapsed time to execute a program on a single processor, T1, and on a set of p concurrent processors, Tp:
\begin{equation} S_p = \frac{{T_1 }}{{T_p }}. \end{equation}
(24)
Figure 2.

Speedup from parallel processing for a fixed-sized forward problem (588 × 588 × 261).

These results show that on the CRAY XT4 machine, our code is able to achieve a scaled speedup of 90 per cent on 4000 processors for the second-order scheme, and of 75 per cent for the fourth-order scheme. We note the decreasing efficiency of the solution beyond 4000 processors, because of the overhead due to message passing, which becomes a limiting factor in time-to-solution performance. The same parallelization scheme was applied to solve linear system (14) with the Hermitian transpose matrix KH, which is required for gradient calculation.
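
Eq. (24) and the percentages quoted above can be related by a short calculation. The sketch assumes that the quoted "per cent" figures are parallel efficiencies, Sp/p; that reading, the function names and the timings in the example are assumptions rather than statements from the source.

```python
def speedup(t1, tp):
    """Fixed-size speedup of eq. (24): S_p = T_1 / T_p."""
    return t1 / tp

def efficiency(t1, tp, p):
    """Parallel efficiency S_p / p, as a fraction of ideal linear scaling."""
    return speedup(t1, tp) / p

# Hypothetical timings in seconds, chosen only to illustrate the arithmetic.
t1, tp, p = 36000.0, 10.0, 4000
print(f"S_p = {speedup(t1, tp):.0f}")
print(f"efficiency = {100 * efficiency(t1, tp, p):.0f} per cent")
```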

3.2 Data decomposition

To avoid message passing overhead and subsequent inefficiencies, a second level of parallelization is realized by distributing the data such that sources and corresponding receivers of a data set are assigned to specific groups of processors. Let ndata correspond to the number of such data groups. The total number of processors or tasks employed would then be ntot = ndata × nxyz. Here, ndata copies of the forward problem are distributed across the parallel machine. The data decomposition enables keeping a balance between nxyz, the size of the forward problem (dictated by the size of its corresponding FD mesh) and the number of independent source activations and frequencies. At the same time, data groups can increase linearly with the total number of sources or shots employed in the imaging experiment, given the total number of computational tasks available.
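
The two-level layout can be illustrated by mapping a global task index onto a data group and a position in the Cartesian processor grid. The ordering used below (data group varies slowest, x fastest) is one possible convention chosen for illustration; the source does not specify the ordering, and the function name and example sizes are hypothetical.

```python
def task_layout(rank, ndata, nx, ny, nz):
    """Map a global rank to (data_group, (ix, iy, iz)), ntot = ndata*nx*ny*nz."""
    nxyz = nx * ny * nz
    if not 0 <= rank < ndata * nxyz:
        raise ValueError("rank outside the range of available tasks")
    group, local = divmod(rank, nxyz)   # which copy of the forward problem
    iz, rem = divmod(local, nx * ny)    # position inside that copy
    iy, ix = divmod(rem, nx)
    return group, (ix, iy, iz)

# Assumed example: 4 data groups, each with a 6 x 2 x 2 processor grid.
ndata, nx, ny, nz = 4, 6, 2, 2
print("total tasks:", ndata * nx * ny * nz)
print(task_layout(0, ndata, nx, ny, nz))
print(task_layout(29, ndata, nx, ny, nz))
```

With MPI, the same split would typically be expressed by creating one sub-communicator per data group and a Cartesian communicator inside each group.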

Bottlenecks for data decomposition parallelization may arise from large Krylov convergence differences between the data groups. To achieve good load balancing, the sources are distributed among the data groups such that each group has a similar workload in terms of convergence of the Krylov solver. This can be estimated in advance with a trial inversion iteration. However, it should be noted that convergence characteristics are subject to changes during later stages of an inversion, owing to changing model properties. Thus, we make the workload distribution dependent only on the seismic sources frequencies and damping constants, while nxyz is kept constant for each data group. Fig. 3 shows speedup of the fixed size FWI test problem (129 × 40 × 47 grid cells) as a function of the number of processors. The modelling was performed with the second-order FD scheme and with two variants of model decomposition (nx = 6, ny, z = 2 or nx = 3, ny, z = 1), yielding nearly identical performance. Since data decomposition is highly parallel, a scaled speedup of about 80 per cent on 5000 processors is realized.
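
One common way to obtain such a balance is a greedy "longest processing time first" assignment, sketched below. It is offered as an illustration only: the source does not say which balancing rule is used, and the per-source Krylov iteration counts are made-up numbers standing in for estimates from a trial inversion iteration.

```python
import heapq

def balance_sources(costs, ngroups):
    """Assign sources to data groups so that summed costs are roughly equal.

    costs maps a source id to its estimated workload (e.g. Krylov iterations).
    Returns a list of (total_cost, [source ids]) with one entry per group.
    """
    heap = [(0.0, g) for g in range(ngroups)]
    heapq.heapify(heap)
    groups = [[] for _ in range(ngroups)]
    # Place the most expensive sources first, always into the lightest group.
    for src, cost in sorted(costs.items(), key=lambda kv: kv[1], reverse=True):
        total, g = heapq.heappop(heap)
        groups[g].append(src)
        heapq.heappush(heap, (total + cost, g))
    totals = {g: total for total, g in heap}
    return [(totals[g], groups[g]) for g in range(ngroups)]

# Hypothetical iteration counts for eight sources.
costs = {"s1": 1300, "s2": 4000, "s3": 380, "s4": 750,
         "s5": 2000, "s6": 1700, "s7": 350, "s8": 1000}
for total, members in balance_sources(costs, 3):
    print(total, members)
```

Because convergence changes as the model evolves, such an assignment would need to be recomputed, or based on quantities that stay fixed during the inversion, as the text goes on to describe.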

Figure 3.

Speedup from parallel processing for a fixed-sized inverse problem (129 × 40 × 47).

4 SYNTHETIC EXAMPLES

In this section, we present several numerical examples of 3-D FWI to validate the algorithm and to give an estimate of the computing cost of the approach. All parts of the wavefield (refractions, turning waves and reflections) are accounted for simultaneously in the inversion. Synthetic data were generated by the Laplace–Fourier domain FD modelling technique (Petrov & Newman 2012) in conjunction with perfectly matched layers (PML; Hastings et al. 1996; Kim & Pasciak 2010) and free-surface boundary conditions (Gottschammer & Olsen 2001). This is the same forward code that is embedded in our FWI algorithm. However, we produced the forward data at much greater accuracy than we required for the predicted data within the inversion iteration. The synthetic data were generated using significantly smaller solution tolerances in the Krylov iterations (∼3 orders of magnitude) than the tolerances employed within the inversion for computing data predictions. For inversion, we use noise-free data obtained with the second-order scheme. All examples were computed on CRAY XE6 and CRAY XC30 machines at the National Energy Research Scientific Computing (NERSC) Center.

4.1 Inclusion model

First, we apply 3-D FWI to a simple velocity model composed of a homogeneous background with an inclusion. Velocities in the background medium are 2150 m s−1 for the S wave and 4250 m s−1 for the P wave, with a background density of 2300 kg m–3. The inclusion consists of prisms with 1800 m s−1 for S-wave velocity, 3600 m s−1 for P-wave velocity and 2100 kg m–3 for density. To demonstrate the sensitivity of the method, we used the general case of different spatial distributions for the S- and P-wave velocities (Figs 4a and b): both are assigned to the same prism, but rotated about the z-axis by 90° relative to one another. Consequently, for the P wave, the size of the inclusion is defined by |$L_x^p = 20\;{\rm m},\;L_y^p = 30\;{\rm m},\;L_z^p = 20\;{\rm m}$|, while for the S wave it is |$L_x^S = 30\;{\rm m},\;L_y^S = 20\;{\rm m},\;L_z^S = 20\;{\rm m}$|. The density has the same distribution as the P velocity.

Figure 4.

The synthetic example used to test the inversion algorithm (a) P velocity (background – 4250 m s−1, 3600 m s−1 – inclusion) and (b) S velocity (background – 2100 m s−1, 1800 m s−1 – inclusion).

Twelve wells surround the target with 14 horizontal xy-directed point dipole sources at 6 m interval straddling the target (Fig. 5). All three components of the wavefields were calculated in all other wells at 6 m intervals, excluding the source well.

Figure 5.

The synthetic example with wellbore positions and P-velocity inclusion geometry.

The model is discretized on a 46 × 46 × 57 grid with 2 m cubic cells, with PML layers spread along five grid cells on each side and in each direction. The inverted frequencies are 30, 50, 70, 100 and 140 Hz, which were determined using the principles described by Sirgue & Pratt (2004). As the model and the distances between wells are not large, the iterative solution of the forward problem (6) is sufficiently fast over the whole range of frequencies. Here, therefore, we used only one Laplace damping parameter, σ = 1 s−1, which provides good convergence of the iterative solution and a small difference from the pure Fourier image. Since low frequency data (30 and 50 Hz) are used in the inversion process, it is not necessary to use large damping parameters for additional smoothing of solutions. The initial model was a homogeneous medium with the background P and S velocities and density.

In total, the data consist of 25 088 source–receiver pairs, and the inversion domain consists of 120 612 cells, including the PML layers. The inversion was performed with unweighted data, that is, using E = I in eq. (1), where I is the identity matrix.

The recovered image in Fig. 6 shows the location, geometry and values of the inclusion fairly well. The relative error is less than 1 per cent for the P-wave velocity, less than 2.5 per cent for the S-wave velocity and about 5 per cent for the density. Unlike the other two parameters, the density is very sensitive to the sequence of the Laplace–Fourier domain inversion process, and it is very hard to estimate using classical FWI theory. An acceptable density distribution is obtained only when the inversion process begins with the lowest frequency data, at 30 Hz.

Figure 6.

Reconstructed S- and P-wave velocities and the density for the synthetic example illustrated in Fig. 4 for different image slices at x = 0 m, y = 0 m, z = 47 m.

Readers interested in this problem can find a discussion of strategies for density reconstruction in Jeong et al. (2012) and references therein. Because the inversion results are less sensitive to density than to the shear and bulk moduli, one can use an analytical relation between density and P- or S-wave velocity (Gardner et al. 1974; Dey & Stewart 1997), which we do in the next examples.

About 100–200 iterations for each frequency were needed to obtain this reconstruction with the normalized data error en at 2 per cent. en is defined by
\begin{equation} e_n = \frac{1}{{N_q }}\sqrt {\sum\limits_{s_k } {\sum\limits_q {\frac{{\left[ {{\bf d}_q^{{\rm obs}} \left( {s_k } \right) - {\bf d}_q^{{\rm sim}} ({\bf m},s_k )} \right]^H \left[ {{\bf d}_q^{{\rm obs}} \left( {s_k } \right) - {\bf d}_q^{{\rm sim}} ({\bf m},s_k )} \right]}}{{{\bf d}_q^{{\rm obs}} \left( {s_k } \right)^H {\bf d}_q^{{\rm obs}} \left( {s_k } \right)}}} } } \times 100\,{\rm per}\,{\rm cent}, \end{equation}
(25)
where Nq is the total number of complex data values used in the inversion. The time required to compute one inverse iteration on 1000 cores of the Cray XE6 ranges from ∼20 to 60 s, depending on the frequency.
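
A direct transcription of eq. (25) is sketched below. It follows the equation as printed, with the factor 1/Nq outside the square root, and treats each datum as a scalar complex number already flattened over sources and data indices; both the function name and that flattening are simplifying assumptions.

```python
import math

def normalized_data_error(d_obs, d_sim):
    """Normalized data error e_n of eq. (25), in per cent.

    d_obs and d_sim are equally long sequences of complex data values,
    flattened over sources s_k and data indices q.
    """
    if len(d_obs) != len(d_sim) or len(d_obs) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    total = 0.0
    for obs, sim in zip(d_obs, d_sim):
        residual = obs - sim
        # |residual|^2 / |obs|^2 for a scalar complex datum.
        total += abs(residual) ** 2 / abs(obs) ** 2
    return math.sqrt(total) / len(d_obs) * 100.0

# Made-up observed data and a slightly perturbed prediction.
d_obs = [1 + 1j, 2 - 1j, -3j, 4 + 0j]
d_sim = [d * (1.0 + 0.02j) for d in d_obs]
print(f"e_n = {normalized_data_error(d_obs, d_sim):.3f} per cent")
```

Normalizing each residual by the corresponding observed amplitude keeps weak, far-offset data from being swamped by strong near-offset data in the error measure.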

4.2 Layered model

In the next example, we consider the 3-D inversion of a 2-D elastic medium for a data set collected using a cross-well measurement setup (Habashy et al.2011). The true S- and P-wave velocities are given in Figs 7(a) and (b), with the size of the inversion domain 76 m × 76 m × 200 m. Only part of the inversion domain is shown in Fig. 7; the cells outside this region are used to keep the boundary of the inversion domain at a distance. Our model consists of several layers that deviate from the horizontal and the water-injection region located between z = 38 m and z = 50 m. The survey uses two wells located at x = 24 m, y = 0 m and x = −24 m, y = 0 m, with 30 point (x and y)-directed dipole sources in the first well and 30 receivers in the other, and vice versa. Data consist of the two components (x and z) of the displacement velocity fields, with synthetic data generated by solving the forward problem numerically using a 3-D FD approach with the second-order scheme. In the inversion, we used unweighted data at 50, 70, 100, 140, 200 and 280 Hz; the damping constant was set at 1 s−1, as in the previous example. FD cell size is 2 m for frequencies 50–140 Hz and 1 m for 140–280 Hz; thus, in the simulation, 30 × 30 × 72 or 50 × 50 × 135 meshes are used, respectively.

Figure 7.

The true (a–b) and the inverted (c–d) distributions of (a, c) S-wave velocity (m s–1) and (b, d) P-wave velocity (m s–1).

The initial model used in the inversion is a homogeneous medium, with an S-wave velocity of 2000 m s−1 and a P-wave velocity of 3950 m s−1. To begin, the inversion was performed on a coarse grid with frequencies from 30 to 140 Hz; then we interpolated the obtained solution to the fine grid and continued the inversion process. Only two independent parameters (P and S velocities) are reconstructed here. Density was defined via the P velocity by the Gardner relationship (Gardner et al. 1974):
\begin{equation} \rho = AV_p^{1/4} , \end{equation}
(26)
where the parameter A is set to 293. Note that we deliberately chose an A value greater than 283, the value that corresponds to the exact relationship between the density and P-wave velocity, to estimate the sensitivity of the inversion to the error in the parametrization assumed by eq. (26).
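
For orientation, eq. (26) can be evaluated directly. The sketch uses the initial-model P-wave velocity quoted above; the function name is an assumption, and SI units (m s−1 for velocity, kg m−3 for density) are assumed for the constant A.

```python
def gardner_density(vp, a=293.0):
    """Density (kg/m^3) from P-wave velocity (m/s) via eq. (26): A * Vp**(1/4)."""
    return a * vp ** 0.25

vp = 3950.0  # initial-model P-wave velocity of this example, m/s
rho_used = gardner_density(vp, a=293.0)
rho_exact = gardner_density(vp, a=283.0)
print(f"rho(A=293) = {rho_used:.0f} kg/m^3, rho(A=283) = {rho_exact:.0f} kg/m^3")
print(f"relative density bias = {100 * (rho_used / rho_exact - 1):.1f} per cent")
```

Because A enters linearly, the imposed density bias is 293/283 − 1, about 3.5 per cent, whatever the velocity.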
Inversion results are shown in Figs 7(c) and (d). As can be seen, the resolution of the reconstructed S-wave velocity distribution is higher than that of the P-wave velocity, similar to the 2-D results of Xiong et al. (2013). We attribute this to two main factors: the smaller wavelength of the S wave relative to the P wave, and the higher sensitivity of system (6) to the shear modulus than to the bulk modulus,
\begin{equation} \frac{{\partial \phi _d }}{{\partial \mu _p }} \gg \frac{{\partial \phi _d }}{{\partial \kappa _p }}. \end{equation}
(27)
As mentioned above, the reconstructed results are not very sensitive to the density, hence the constraint placed on density via the Gardner relationship. We note that the errors observed in the density distribution without the Gardner constraint were compensated for by deviations in the shear and bulk moduli, but the reconstructed P and S velocities were always close to the true values. To compare the true and the inverted models in greater detail, we show in Fig. 8 the vertical P- and S-wave velocity profiles at x = y = 0 m for the true model (solid line) and the inverted model (dotted line). The overall shapes of the inverted vertical velocity profiles are very similar to the true shape. Typically, about 200–300 iterations for each frequency were needed to obtain this reconstruction with the normalized data error en approaching 3 per cent. We also included PML layers in the inversion domain, because omitting them can adversely affect the reconstructed model within the interwell region. The typical time required per inversion iteration, at a given frequency and damping constant, was about 100 s, using 1000 cores of the Cray XE6.
Figure 8.

The vertical S, P-wave velocity profiles for the true, inverted and initial models.

4.3 Marmousi model

In the final example, we use the truncated Marmousi-2 elastic model (Martin et al. 2002; Xiong et al. 2013) to test our 3-D inverse modelling algorithm. Figs 9(a) and (b) show P- and S-wave velocities of the Marmousi-2 model, where portions of the left- and right-hand side, along with part of the water layer, are removed from the original model. This model is considered to be 2-D, in that it is invariant along its strike direction, denoted by the y-axis. The dimension of the inversion domain is 5.12 km × 1.56 km × 1.84 km and it is discretized into 129 × 40 × 47 grid cells. The cell size is 40 m × 40 m × 40 m.

Figure 9.

True (a–b) and final results (c–d) after 6 Hz data inversion (a, c) P-wave velocity (m s–1) and (b, d) S-wave velocity (m s–1) of the Marmousi-2 elastic model.

There are 210 dipole point sources uniformly located from x = 2840 to 7480 m along seven lines that span y = −480 to +480 m at 160 m intervals and at a depth of 160 m. A total of 870 receiver locations are equally spaced from x = 2800 to 7520 m, along fifteen lines that span y = −560 to +560 m at 80 m intervals; the inline y = 0 measures two components per detector (the x and z components of the velocity displacement field), while the broadside lines have three components per detector (all three x, y and z components of the velocity displacement field). Receiver depth is 320 m, and each source uses all available receivers. Weighted data are used where weights are proportional to the absolute value of the observed velocity field, that is |$ {\bf E} \sim \sqrt {\vert{{\bf d}^{\rm obs}}\vert}$|, to compensate for attenuation and geometric spreading of the observed data with offset. The data are inverted with four different frequency sets, which include 1.5, 3, 5 and 6 Hz. Thus, at 6 Hz, we have at least 5 grid points per wavelength for the shear wave.
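
The source layout can be checked with a few lines of code. The sketch assumes an inline source spacing of 160 m, which is inferred here from 210 sources on seven lines (30 per line between x = 2840 and 7480 m) rather than stated explicitly in the text; the function name and coordinate ordering are likewise illustrative.

```python
def source_positions(x_start=2840.0, x_end=7480.0, dx=160.0,
                     y_start=-480.0, y_end=480.0, dy=160.0, depth=160.0):
    """Return (x, y, z) source coordinates in metres on a regular set of lines."""
    nx = int(round((x_end - x_start) / dx)) + 1   # sources per line
    ny = int(round((y_end - y_start) / dy)) + 1   # number of lines
    return [(x_start + i * dx, y_start + j * dy, depth)
            for j in range(ny) for i in range(nx)]

sources = source_positions()
n_lines = len({y for _, y, _ in sources})
print(len(sources), "sources on", n_lines, "lines")
```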

For the lowest frequency inversion and largest damping constant, the initial velocity model is composed of linearly increasing functions with depth, with P-wave velocity varying from 1500 to 2200 m s−1 and S-wave velocity varying from 1000 to 1500 m s−1. The density is again defined by the Gardner relationship in eq. (26), with parameter A = 284. We implemented a single-frequency inversion progressing from the lowest frequency to the highest, with reverse loop, as shown in Figs 1(a) and (b). For each complex frequency, we terminated the inversion when the value of the data misfit (normalized data error) en of (3.5) per cent was achieved, or when the relative change in the model parameters between two successive iterations was less than 0.1 per cent.
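
The two termination tests can be written as a small predicate. This is a sketch under stated assumptions: the misfit target of 3.5 per cent is read from the value quoted above, the relative model change is measured with the Euclidean norm (the source does not name the norm), and the function and parameter names are hypothetical.

```python
def should_stop(misfit_percent, model_old, model_new,
                misfit_target=3.5, rel_change_tol=1e-3):
    """Return True when either termination test described in the text is met."""
    if misfit_percent <= misfit_target:
        return True
    # Relative model change between successive iterations (Euclidean norm,
    # an assumed choice); 1e-3 corresponds to 0.1 per cent.
    diff = sum((new - old) ** 2 for old, new in zip(model_old, model_new)) ** 0.5
    norm = sum(old ** 2 for old in model_old) ** 0.5
    return norm > 0 and diff / norm < rel_change_tol

print(should_stop(3.2, [1500.0, 2200.0], [1510.0, 2190.0]))  # misfit test met
print(should_stop(5.0, [1500.0, 2200.0], [1510.0, 2190.0]))  # neither test met
print(should_stop(5.0, [1500.0, 2200.0], [1500.5, 2200.0]))  # model has stalled
```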

The best result of the inversion was obtained with a full data set that includes 1.5, 3, 5 and 6 Hz data. Results after the 6 Hz data inversion are shown in Figs 9(c) and (d) and Figs 10 and 11. As the figures show, the inversion generates very good results: not only have we captured the shape of the faults in the central region of the Marmousi model, but we have also revealed significant detail in this region, where the imaged velocities are computed very accurately. These results show that high spatial resolution can be achieved in the velocity models at relatively low frequencies, that is, 6 Hz. In Fig. 10, at a depth of 1300–1450 m, two neighbouring peaks are successfully separated at a distance of slightly less than half a wavelength. The details of the workflow process for the inversion at different frequencies are given on the left-hand side of Table 1. They confirm that the application of a large damping constant at the beginning of the inversion (at fixed frequency) enables reducing the cost of very expensive calculations employing a small damping constant later in the inversion procedure. During this inversion we applied a pre-conditioner based on the diagonal of the pseudo-Hessian matrix suggested by Choi et al. (2008), but it does not fundamentally solve the slow convergence of the inverse iteration.

Figure 10.

The vertical (a) P-, (b) S-wave velocity profiles for the true, inverted and initial models.

Figure 11.

The horizontal (a) P-, (b) S-wave velocity profiles for the true, inverted and initial models.

Table 1.

The details of the regular 3-D inverse process shown in Fig. 1(a) for the full data set containing frequencies (1.5, 3, 5, 6 Hz), damping constants (1, 0.5, 0.25 s−1) with the initial inversion Laplace parameter σini = 1 (s−1); fini = 1.5 (Hz) and the truncated inverse process (Fig. 1b) containing only large damping constants (from 6.0 to 3.0 s−1) for lower frequencies. Note that sequential ordering is from the top to bottom in the table.

Left-hand columns: sequential downward waveform inversion with reverse loops shown in Fig. 1(a). Right-hand columns: truncated sequential downward waveform inversion with reverse loop shown in Fig. 1(b).
Column headings: f (Hz); σ (s−1), number of inverse iterations and normalized data error en (per cent) for each of the two workflows; average number of iterations for the forward problem solution.
1.563443.3∼380
33033.5∼750
12824.8∼1300
0.51802.3∼4000
0.25652.4∼10 000
363622.7∼300
33413.1∼350
16423.811452.6∼1700
0.5882.3∼2000
0.25662.4∼5000
536173.1∼700
13062.813642.8∼1400
0.51482.30.5812.5∼3700
0.25922.40.25352.8∼6300
1.50.5782.3∼2000
31132.4∼1000
60.51263.20.51193.0∼4000
0.25803.10.25602.7∼10 000

Our simulations show that the truncated cycle of inversion works quite well, obviating the need to perform the full cycle of inversions over all damping constants for all frequencies. In most cases, for low frequencies only large damping constants can be used for inversion, as illustrated in Fig. 1(b). Small damping is applied only to the last steps of the inversion process for high frequencies. One can see (in Figs 10 and 11) that the difference between the full and truncated cycles for the inverted model is slight. The truncated cycle workflow is shown on the right-hand side of Table 1.

A key requirement for successful wide-angle FWI is the presence of low frequencies within the field data. These low frequencies, less than several hertz, are often not easy to identify on time-domain displays, and Fourier amplitude spectra cannot easily distinguish between signal and noise. However, an acceptable signal-to-noise ratio for the Fourier domain may be achieved at frequencies starting at 3 Hz (Warner et al. 2013). Therefore, we have attempted to perform the inversion for this case without the lowest frequency of 1.5 Hz present. We used the same configuration of sources and receivers and the same initial model, but began the inversion using 3 Hz data with a damping constant σini = 1 (s−1), the same damping constant used to initiate the inversion in the previous reconstruction. The results are shown in Figs 12(a) and (b). Only the shallow part of the model, down to a depth of 1000 m, is acceptable. The deepest 3-D structures were not reconstructed, although it is possible to recognize some strongly reflecting interfaces, which are directly connected with P- and S-wave velocity discontinuities. In this case, it appears that the 1.5 Hz frequency information is required for an acceptable reconstruction.

Figure 12.

The results after 6 Hz data inversion P-wave velocity (m s–1) and S-wave velocity (m s–1) with the lowest frequency of 3 Hz for different initial damping constants (a, b) σ = 1 (s−1); (c, d) σ = 3 (s−1), (e, f) σ = 6 (s−1).

Nevertheless, even in the absence of this low frequency data, we can still obtain an acceptable reconstruction by increasing the damping constant. Figs 12(c) and (d) show the inverted results using σini = 3 (s−1) as an initial damping constant with 3 Hz data. These images are very close to those obtained with the 1.5 Hz frequency data set. All 3-D structures were reconstructed, and the estimated wave velocities are very close to the true values. However, some differences do appear near the left- and right-hand borders of the inversion domain. Increasing σini to 6 (s−1) also provides acceptable results, shown in Figs 12(e) and (f). However, they are worse than the results obtained with σini = 3 (s−1), particularly for the deepest part of the model (below 1600 m). Thus, it appears that wavefield attenuation becomes an issue when the damping constant is set too large. The workflows are presented in Table 2.

Table 2.

The details of 3-D inverse process for the truncated data set containing frequencies (3, 5, 6 Hz) with the initial lowest frequency fini = 3 (Hz) and three different initial damping parameters σini = 1, 3, 6 (s−1). Note once more that sequential ordering is from the top to bottom in the table.

Column groups: σini = 1 (s−1); σini = 3 (s−1); σini = 6 (s−1).
Column headings: f (Hz); σ (s−1), number of inverse iterations and data error en (per cent) for each value of σini; average number of iterations for the forward problem solution.
33611749.3∼300
41299.2∼350
39512.535943.1∼350
2382.92652.4∼600
12083.11262.31233∼1700
0.51003.20.5113.10.5502.9∼2300
0.25883.20.2553.70.25173.5∼5000
534033.238943.1∼700
21063.12∼1000
14793.211803.212562.7∼1500
0.53263.20.51073.10.5714.2∼3500
0.252626.20.25883.30.25∼6000
31292.4∼1000
6612473∼2000
0.53373.60.53172.9∼4000
0.251263.60.25483.6∼10 000

Undoubtedly, the use of large Laplace damping constants may substantially help the performance of FWI when low frequency data are lacking, but it cannot replace such data completely. For example, we could not obtain acceptable results using only 5 Hz data for inversion. Our simulations show that the availability of 1.5 Hz data for the Marmousi-2 elastic model not only gives a better reconstruction, but also substantially reduces the number of both forward and inversion iterations (see Tables 1 and 2). In any case, it appears probable that the order of single-frequency inversions with changing damping constants, from the largest to the smallest, is optimal for the inversion scheme when imaging reflection data, and where the forward problem is solved using an iterative method, such as a Krylov solver.

Although the results in Figs 12(c)–(f) were obtained after the inversion of 6 Hz data, we determined that acceptable results could be achieved with just the two-frequency set (3 and 5 Hz). Using 6 Hz data gives the best results, but results for a maximum frequency of 5 Hz are still very good. Figs 13(a) and (b) present the inverted images of the P- and S-velocity distributions obtained with only 3 and 5 Hz data, which should be compared with the result inverted using 1.5, 3, 5 and 6 Hz data and with the exact solution in Fig. 9. The Marmousi model clearly demonstrates that our algorithm, based on the sequentially ordered single-frequency inversion in the Laplace–Fourier domain, can successfully produce 3-D images of complex structures. As the number of frequencies used for the waveform inversion in the Laplace–Fourier domain is increased (particularly at low frequencies), the computational efficiency is also increased. Because the reverse loop implementation is adopted, we can easily perform quality control on the inversion process over all frequencies.

Figure 13.

The inversion of (a) P-wave velocity (m s–1) and (b) S-wave velocity (m s–1) with the 3 and 5 Hz data.

5 CONCLUSION

We have presented a 3-D massively parallel elastic waveform inversion algorithm based on an iterative solver, which exploits damping of the wavefield for solution efficiency. In this algorithm, we use sequentially ordered single-frequency waveform inversion in a multiscale approach. The lowest frequency data are inverted first, followed by higher frequency data. With reflection data, at a fixed frequency, the inversion is performed from the largest Laplace damping constant to the smallest one, and the final model obtained with the smallest damping constant is used as the starting model for the next frequency. To test the adequacy of the model at all frequencies and to prevent cycle-skipping artefacts in this recurrence, we allow for the option of resampling some lower complex-frequency data during the inversion iterations.

We showed that a truncated inversion workflow, relevant to reflection data, can be employed very efficiently to reduce the number of the most expensive inversion iterations, namely those that use small damping constants. This scheme suggests that at higher frequencies, small damping should be used for the final stages of the process, whereas large damping should be used only for the lower frequencies.
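The ordering summarized above can be illustrated with a short Python sketch. It is not the authors' implementation: the function names `build_schedule` and `run_ladder`, the callback `invert_single`, and the parameters `truncate_above_hz` and `keep_smallest` are hypothetical, and the numerical values in the usage note are placeholders. The sketch only fixes the order in which single-frequency inversions are launched (frequencies ascending, damping constants descending) and passes each final model on as the next starting model.

```python
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

Model = TypeVar("Model")


def build_schedule(
    frequencies_hz: Sequence[float],
    dampings: Sequence[float],
    truncate_above_hz: Optional[float] = None,
    keep_smallest: int = 1,
) -> List[Tuple[float, float]]:
    """Return (frequency, damping) pairs in the order they are inverted.

    Frequencies run from lowest to highest; at each frequency the damping
    constants run from largest to smallest. If truncate_above_hz is given
    (hypothetical option), frequencies above it keep only the keep_smallest
    smallest damping constants, mimicking a truncated workflow.
    """
    if keep_smallest < 1:
        raise ValueError("keep_smallest must be at least 1")
    schedule: List[Tuple[float, float]] = []
    for freq in sorted(frequencies_hz):
        ladder = sorted(dampings, reverse=True)  # largest damping first
        if truncate_above_hz is not None and freq > truncate_above_hz:
            ladder = ladder[-keep_smallest:]  # small damping only
        schedule.extend((freq, sigma) for sigma in ladder)
    return schedule


def run_ladder(
    start_model: Model,
    schedule: Sequence[Tuple[float, float]],
    invert_single: Callable[[Model, float, float], Model],
) -> Model:
    """Chain single-frequency inversions; each result seeds the next one.

    invert_single(model, freq, sigma) is an assumed user-supplied routine
    that runs one inversion experiment and returns the updated model.
    """
    model = start_model
    for freq, sigma in schedule:
        model = invert_single(model, freq, sigma)
    return model
```

For example, `build_schedule([1.5, 3.0, 5.0, 6.0], [3.0, 0.5, 0.25])` yields twelve experiments, starting with 1.5 Hz at the largest damping constant, whereas adding `truncate_above_hz=3.0` leaves a single, weakly damped experiment at each of 5 and 6 Hz. Keeping the schedule separate from the inversion routine makes it easy to insert a repeated lower-frequency experiment for quality control.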

The advantages of our approach include the efficiency of the forward problem solution provided by the iterative solver under the ordering stated above, its efficiency for conducting simulations on relatively modest 3-D FD grids, and the parallelization of the inverse problem resulting from domain and data decompositions.

We have presented several applications of increasing complexity to validate the algorithm on synthetic examples. Some preliminary applications to a fragment of the Marmousi model suggest that FWI exploiting damped wavefields can be applied successfully to frequencies up to 6 Hz, and that it is possible to develop accurate 3-D velocity images at a resolution that approaches half the smallest wavelength utilized to image the subsurface: in the case of the Marmousi model, 150 m for a velocity of 3600 m s−1. The final velocity structure was recovered from a simple starting model, and not a smoothed version of the actual velocity model as is common in many FWI implementations. We were able to achieve acceptable results simply by using a velocity model that increases linearly with depth.

It is true that the results presented here were obtained for modest models and noise-free synthetic data. This is because our primary purpose was to work out an effective workflow that allows FWI to be carried out at the least computational expense. The next logical step will be to demonstrate the 3-D inversion scheme on large-scale models for data with noise, along with source signature definition. This will be developed in our future work.

This work was carried out at Lawrence Berkeley National Laboratory with funding provided by the U.S. Department of Energy Office of Science and Geothermal Program Office under respective contract numbers DE-AC02-05CH11231 and GT-480010-19823-10. Computational resources were provided by the National Energy Research Scientific Computing (NERSC) Center.

REFERENCES

Alumbaugh D.L., Newman G.A., Prevost L., Shadid J.N. Three dimensional, wideband electromagnetic modeling on massively parallel computers, Radio Sci., 1996, vol. 31 (pg. 1-23)
Aminzadeh F., Brac J., Kunz T. 3-D Salt and Overthrust Models: SEG/EAGE 3-D Modeling Series No. 1, 1997, Society of Exploration Geophysicists
Ben-Hadj-Ali H., Operto S., Virieux J. Velocity model-building by 3D frequency-domain, full waveform inversion of wide aperture seismic data, Geophysics, 2008, vol. 73(5) (pg. VE101-VE117)
Brenders A., Pratt R. Full waveform tomography for lithospheric imaging: results from a blind test in a realistic crustal model, Geophys. J. Int., 2007, vol. 168 (pg. 133-151)
Brossier R., Operto S., Virieux J. Seismic imaging of complex onshore structures by 2D elastic frequency-domain full-waveform inversion, Geophysics, 2009, vol. 74 (pg. WCC105-WCC118)
Brown B.M., Jais M., Knowles I.W. A variational approach to an elastic inverse problem, Inverse Problems, 2005, vol. 21 (pg. 1953-1973)
Bunks C., Saleck F.M., Zaleski S., Chavent G. Multiscale seismic waveform inversion, Geophysics, 1995, vol. 60 (pg. 1457-1473)
Choi Y., Min D.-J., Shin C. Frequency-domain elastic full waveform inversion using the new pseudo-Hessian matrix: experience of elastic Marmousi-2 synthetic data, Bull. seism. Soc. Am., 2008, vol. 98 (pg. 2402-2415)
Commer M., Newman G. New advances in three-dimensional controlled-source electromagnetic inversion, Geophys. J. Int., 2008, vol. 172 (pg. 513-535)
Dey A.K., Stewart R.R. Predicting density using Vs and Gardner's relationship, CREWES Res. Rep., 1997, vol. 9(6) (pg. 1-9)
Erlangga Y., Nabben R. Multilevel projection-based nested Krylov iteration for boundary value problems, SIAM J. Scient. Comput., 2008, vol. 30 (pg. 1572-1595)
Fletcher R., Reeves C.M. Function minimization by conjugate gradients, Computer J., 1964, vol. 7 (pg. 149-154)
Fichtner A. Full Seismic Waveform Modelling and Inversion, 2011, Springer
Gardner G.H.F., Gardner L.W., Gregory A.R. Formation velocity and density. The diagnostic basics for stratigraphic traps, Geophysics, 1974, vol. 39 (pg. 770-780)
Gottschammer E., Olsen K.B. Accuracy of the explicit planar free-surface boundary condition implemented in a fourth-order staggered-grid velocity-stress finite-difference scheme, Bull. seism. Soc. Am., 2001, vol. 91 (pg. 617-623)
Graves R.W. Simulating seismic wave propagation in 3D elastic media using staggered-grid finite differences, Bull. seism. Soc. Am., 1996, vol. 86 (pg. 1091-1106)
Guasch L., Warner M., Nangoo T., Morgan J., Umpleby A., Stekl I., Shah N. Elastic 3D full-waveform inversion, Proceedings of the 82nd Annual International Meeting, 2012 (pg. 1-5), Las Vegas, SEG Technical Program, Expanded Abstracts
Habashy T.M., Abubakar A., Pen G., Belani A. Source-receiver compression scheme for full-waveform seismic inversion, Geophysics, 2011, vol. 78 (pg. R95-R108)
Hastings F.D., Schneider J.B., Broschat S.L. Application of the perfectly matched layer (PML) absorbing boundary condition to elastic wave propagation, J. acoust. Soc. Am., 1996, vol. 100 (pg. 3061-3069)
Hestholm S., Ruud B. 3D free-boundary conditions for coordinate-transform finite-difference seismic modeling, Geophys. Prospect., 2002, vol. 50 (pg. 463-474)
Hu W., Abubakar A., Habashy T.M. Simultaneous multifrequency inversion of full-waveform seismic data, Geophysics, 2009, vol. 74(2) (pg. R1-R14)
Jeong W., Lee H.-Y., Min D.-J. Full waveform inversion strategy for density in the frequency domain, Geophys. J. Int., 2012, vol. 188 (pg. 1221-1242)
Kim S., Pasciak J.E. Analysis of a Cartesian PML approximation to acoustic scattering problems in R2, J. Math. Anal. Appl., 2010, vol. 370 (pg. 168-186)
Marfurt K. Accuracy of finite-difference and finite-element modeling of the scalar and elastic wave equations, Geophysics, 1984, vol. 49 (pg. 533-549)
Martin G.S., Marfurt K.J., Larsen S. Marmousi-2: an updated model for the investigation of AVO in structurally complex areas, Proceedings of the 72nd Annual International Meeting, 2002, SEG (pg. 1979-1982), Expanded Abstract
Newman G.A., Alumbaugh D.L. Three-dimensional massively parallel electromagnetic inversion – I. Theory, Geophys. J. Int., 1997, vol. 128 (pg. 345-354)
Newman G.A., Alumbaugh D.L. Three-dimensional magnetotelluric inversion using nonlinear conjugate gradients, Geophys. J. Int., 2000, vol. 140 (pg. 410-424)
Onoue Y., Fujino S., Nakashima N. Improved IDR(s) method for gaining very accurate solutions, World Acad. Sci., Eng. Technol., 2009, vol. 55 (pg. 520-525)
Operto S., Ravaut C., Importa L., Virieux J., Dell'Aversana P. Quantitative imaging of complex structures from dense wide-aperture seismic data by multiscale traveltime and waveform inversions: a case study, Geophys. Prospect., 2004, vol. 72 (pg. 625-651)
Operto S., Virieux J., Amestoy P., L'Excellent J.-Y., Giraud L., Ben-Hadj-Ali H. 3-D finite-difference modeling of visco-acoustic wave propagation using a massively parallel direct solver: a feasibility study, Geophysics, 2007, vol. 72(5) (pg. SM195-SM211)
Petrov P.V., Newman G.A. 3D finite-difference modeling of elastic wave propagation in the Laplace-Fourier domain, Geophysics, 2012, vol. 77 (pg. T137-T155)
Plessix R.-E. Three-dimensional frequency-domain full-waveform inversion with an iterative solver, Geophysics, 2009, vol. 74 (pg. WCC149-WCC157)
Polyak E., Ribière G. Note sur la convergence de méthodes de directions conjuguées, Rev. Fr. Inr. Rech. Oper., 1969, vol. 16 (pg. 35-43)
Pratt R.G. Frequency domain elastic wave modeling by finite differences: a tool for cross-hole seismic imaging, Geophysics, 1990, vol. 55 (pg. 626-632)
Pratt R.G. Seismic waveform inversion in the frequency domain. Part 1: theory and verification in a physical scale model, Geophysics, 1999, vol. 64 (pg. 888-901)
Pratt R.G., Shin C., Hicks G.J. Gauss-Newton and full Newton method in frequency domain seismic waveform inversion, Geophys. J. Int., 1998, vol. 133 (pg. 341-362)
Pyun S., Shin C., Lee H., Yang D. 3D elastic full waveform inversion in the Laplace domain, 2008 (pg. 1976-1980), SEG Technical Program Expanded Abstracts
Sirgue L., Pratt R.G. Efficient waveform inversion and imaging: a strategy for selecting temporal frequencies, Geophysics, 2004, vol. 69 (pg. 231-248)
Sirgue L., Etgen J., Albertin U. 3-D frequency-domain waveform inversion using time-domain finite-difference methods, Proceedings of the 70th Conference & Technical Exhibition, 2008, EAGE, Extended Abstracts, F022
Sheen D.-H., Tuncay K., Baag C.-E., Ortoleva P.J. Time domain Gauss-Newton seismic waveform inversion in elastic media, Geophys. J. Int., 2006, vol. 167 (pg. 1373-1384)
Shin C., Cha Y.H. Waveform inversion in the Laplace domain, Geophys. J. Int., 2008, vol. 173 (pg. 922-931)
Shin C., Cha Y. Waveform inversion in the Laplace-Fourier domain, Geophys. J. Int., 2009, vol. 177 (pg. 1067-1079)
Shin C., Min D.-J. Waveform inversion using a logarithmic wavefield, Geophysics, 2006, vol. 71 (pg. R31-R42)
Shin C., Koo N., Cha Y.H., Park K. Sequentially ordered single-frequency 2-D acoustic waveform inversion in the Laplace–Fourier domain, Geophys. J. Int., 2010, vol. 181 (pg. 935-950)
Shipp R.M., Singh S.C. Two-dimensional full wavefield inversion of wide-aperture marine seismic streamer data, Geophys. J. Int., 2002, vol. 151 (pg. 325-344)
Sonneveld P., van Gijzen M.B. IDR(s): a family of simple and fast algorithms for solving large nonsymmetrical systems of linear equations, SIAM J. Sci. Comp., 2008, vol. 31 (pg. 1035-1062)
Tarantola A. Inversion of seismic reflection data in the acoustic approximation, Geophysics, 1984, vol. 49 (pg. 1259-1266)
Tarantola A. A strategy for nonlinear elastic inversion of seismic reflection data, Geophysics, 1986, vol. 51 (pg. 1893-1903)
Virieux J. P-SV wave propagation in heterogeneous media: velocity-stress finite-difference method, Geophysics, 1986, vol. 51 (pg. 889-901)
Virieux J., Operto S. An overview of full-waveform inversion in exploration geophysics, Geophysics, 2009, vol. 74 (pg. WCC127-WCC152)
Wang S., de Hoop M., Xia J. 3D frequency domain full waveform inversion via a massively parallel structured multifrontal solver, 2012a (pg. 1-5), SEG Technical Program Expanded Abstracts
Wang S., de Hoop M., Xia J. Massively parallel structured multifrontal solver for time-harmonic elastic waves in 3-D anisotropic media, Geophys. J. Int., 2012b, vol. 191 (pg. 346-366)
Warner M., Stekl I., Umpleby A. Efficient and effective 3-D wavefield tomography, Proceedings of the EAGE, 70th Conference & Technical Exhibition, 2008, Rome, Italy, Extended Abstracts, F023
Warner M., et al. Anisotropic 3D full-waveform inversion, Geophysics, 2013, vol. 78 (pg. R59-80)
Xiong J.L., Lin Y., Abubakar A., Habashy T.M. 2.5-D forward and inverse modelling of full-waveform elastic seismic survey, Geophys. J. Int., 2013, vol. 193 (pg. 938-948)

APPENDIX A

The FD operators $\mathbf{D}_\tau$ and $\mathbf{D}_v$ that appear in eq. (6) can be expressed as block matrices:
\begin{equation} {\bf D}_v = s_k^{ - 1} \left( {\begin{array}{ccc} {{\bf \tilde D}_{\bf x} } &\quad {{\bf \tilde D}_{\bf y} } &\quad {{\bf \tilde D}_{\bf z} } \\ {{\bf D}_{\bf y} } &\quad {{\bf D}_{\bf x} } &\quad {\bf 0} \\ {{\bf D}_{\bf z} } &\quad {\bf 0} &\quad {{\bf D}_{\bf x} } \\ {{\bf \tilde D}_{\bf x} } &\quad {{\bf \tilde D}_{\bf y} } &\quad {{\bf \tilde D}_{\bf z} } \\ {\bf 0} &\quad {{\bf D}_{\bf z} } &\quad {{\bf D}_{\bf y} } \\ {{\bf \tilde D}_{\bf x} } &\quad {{\bf \tilde D}_{\bf y} } &\quad {{\bf \tilde D}_{\bf z} } \\ \end{array}} \right)\,\;{\bf D}_\tau = s_k^{ - 1} \left( {\begin{array}{cccccc} {{\bf D}_{\bf x} } &\quad {{\bf \tilde D}_{\bf y} } &\quad {{\bf \tilde D}_{\bf z} } &\quad {\bf 0} &\quad {\bf 0} &\quad {\bf 0} \\ {\bf 0} &\quad {{\bf \tilde D}_{\bf x} } &\quad {\bf 0} &\quad {{\bf D}_{\bf y} } &\quad {{\bf \tilde D}_{\bf z} } &\quad {\bf 0} \\ {\bf 0} &\quad {\bf 0} &\quad {{\bf \tilde D}_{\bf x} } &\quad {\bf 0} &\quad {{\bf \tilde D}_{\bf y} } &\quad {{\bf D}_{\bf z} } \\ \end{array}} \right). \end{equation}
(A1)
Operator $\mathbf{D}_\tau$ is applied to the stress components $\boldsymbol{\tau} = (\tau_{xx}, \tau_{xy}, \tau_{xz}, \tau_{yy}, \tau_{yz}, \tau_{zz})^T$ and $\mathbf{D}_v$ to the velocity components $v_q = (v_x, v_y, v_z)^T$; $s_k$ is a complex frequency, and ${\bf \tilde D}_\alpha$, ${\bf D}_\alpha$ $(\alpha = x, y, z)$ are the matrices that realize the FD operators for the first derivatives. Their explicit expressions can be found in Petrov & Newman (2012). The block diagonal matrices $\langle\kappa\rangle$, $\langle\mu\rangle$, $\langle\mu\rangle^{\alpha\beta}$ and $\langle b\rangle^{\alpha}$ contain the averaged values of the corresponding parameters of the medium. Specifically,
\begin{equation} \left\langle {\bf b} \right\rangle = \left( {\begin{array}{ccc} {\left\langle {\bf b} \right\rangle ^{\bf x} } &\quad {\bf 0} &\quad {\bf 0} \\ {\bf 0} &\quad {\left\langle {\bf b} \right\rangle ^{\bf y} } &\quad {\bf 0} \\ {\bf 0} &\quad {\bf 0} &\quad {\left\langle {\bf b} \right\rangle ^{\bf z} } \\ \end{array}} \right),\;\left\langle {{\bf k\mu }} \right\rangle = \left( {\begin{array}{ccc} {\left\langle {\bf k} \right\rangle + \frac{4}{3}\left\langle {\bf \mu } \right\rangle } &\quad {\left\langle {\bf k} \right\rangle - \frac{2}{3}\left\langle {\bf \mu } \right\rangle } &\quad {\left\langle {\bf k} \right\rangle - \frac{2}{3}\left\langle {\bf \mu } \right\rangle } \\ {\left\langle {\bf \mu } \right\rangle ^{xy} } &\quad {\left\langle {\bf \mu } \right\rangle ^{xy} } &\quad 0 \\ {\left\langle {\bf \mu } \right\rangle ^{xz} } &\quad 0 &\quad {\left\langle {\bf \mu } \right\rangle ^{xz} } \\ {\left\langle {\bf k} \right\rangle - \frac{2}{3}\left\langle {\bf \mu } \right\rangle } &\quad {\left\langle {\bf k} \right\rangle + \frac{4}{3}\left\langle {\bf \mu } \right\rangle } &\quad {\left\langle {\bf k} \right\rangle - \frac{2}{3}\left\langle {\bf \mu } \right\rangle } \\ 0 &\quad {\left\langle {\bf \mu } \right\rangle ^{yz} } &\quad {\left\langle {\bf \mu } \right\rangle ^{yz} } \\ {\left\langle {\bf k} \right\rangle - \frac{2}{3}\left\langle {\bf \mu } \right\rangle } &\quad {\left\langle {\bf k} \right\rangle - \frac{2}{3}\left\langle {\bf \mu } \right\rangle } &\quad {\left\langle {\bf k} \right\rangle + \frac{4}{3}\left\langle {\bf \mu } \right\rangle } \\ \end{array}} \right). \end{equation}
(A2)
Following Petrov & Newman (2012), the matrix elements can be expressed as:
\begin{eqnarray} \left\langle \kappa \right\rangle _{ll} &=& \left\langle {\frac{1}{\kappa }} \right\rangle _{i,j,k}^{ - 1} = \left( {\frac{1}{{8\hat h_{x,i} \hat h_{y,j} \hat h_{z,k} }}\sum\limits_{\scriptstyle l = 0,1 \atop {\scriptstyle p = 0,1 \atop \scriptstyle q = 0,1 }} {\frac{{h_{x,i + l} h_{y,j + p} h_{z,k + q} }}{{\kappa _{i - 1/2 + l,j - 1/2 + p,k - 1/2 + q} }}} } \right)^{ - 1} ,\;l = i + (j - 1)N_x + \left( {k - 1} \right)N_x N_y ; \nonumber\\ l &=& 1\ldots N,\;\;N = N_x N_y N_z ; \nonumber\\ h_{x,i} &=& x_i - x_{i - 1} ,\;h_{y,j} = y_j - y_{j - 1} ,\;h_{z,k} = z_k - z_{k - 1} \nonumber\\ \hat h_{x,i} &=& x_{i + 1/2} - x_{i - 1/2} ,\;\hat h_{y,j} = y_{j + 1/2} - y_{j - 1/2} ,\;\hat h_{z,k} = z_{k + 1/2} - z_{k - 1/2} \end{eqnarray}
(A3)
\begin{equation} \left\langle {\bf \mu } \right\rangle _{l,l} = \left\langle {\frac{1}{\mu }} \right\rangle _{i,j,k}^{ - 1} = \left( {\frac{1}{{8\hat h_{x,i} \hat h_{y,j} \hat h_{z,k} }}\sum\limits_{\scriptstyle l = 0,1 \atop {\scriptstyle p = 0,1 \atop \scriptstyle q = 0,1 }} {\frac{{h_{x,i + l} h_{y,j + p} h_{z,k + q} }}{{\mu _{i - 1/2 + l,j - 1/2 + p,k - 1/2 + q} }}} } \right)^{ - 1} , \end{equation}
(A4)
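The volume-weighted harmonic averaging in eqs (A3) and (A4) can be sketched with NumPy as follows. This is a minimal illustration and not the authors' code: the function name `harmonic_node_average` and the array layout (cell-centred parameters of shape `(nx, ny, nz)` with cell widths `hx`, `hy`, `hz`) are assumptions. It also assumes that the half-integer points lie at the cell midpoints, so that $8\hat h_{x,i}\hat h_{y,j}\hat h_{z,k}$ equals the sum of the eight neighbouring cell volumes.

```python
import numpy as np


def harmonic_node_average(param_cells, hx, hy, hz):
    """Volume-weighted harmonic mean of a cell-centred parameter.

    param_cells : array (nx, ny, nz), e.g. kappa or mu in each cell.
    hx, hy, hz  : cell widths along x, y, z (lengths nx, ny, nz).
    Returns an array (nx-1, ny-1, nz-1) holding, for each interior node,
    the average over the eight cells that share that node.
    """
    param = np.asarray(param_cells, dtype=float)
    hx, hy, hz = (np.asarray(h, dtype=float) for h in (hx, hy, hz))
    nx, ny, nz = param.shape

    # Cell volumes h_x * h_y * h_z, broadcast to shape (nx, ny, nz).
    vol = hx[:, None, None] * hy[None, :, None] * hz[None, None, :]
    vol_over_param = vol / param

    num = np.zeros((nx - 1, ny - 1, nz - 1))  # sum of volume / parameter
    den = np.zeros_like(num)                  # sum of volumes = 8*hx^*hy^*hz^
    for l in (0, 1):
        for p in (0, 1):
            for q in (0, 1):
                sl = (slice(l, nx - 1 + l),
                      slice(p, ny - 1 + p),
                      slice(q, nz - 1 + q))
                num += vol_over_param[sl]
                den += vol[sl]
    # Inverse of the weighted mean of 1/parameter.
    return den / num
```

On a uniform grid with eight cells, four of value 1 and four of value 3, the single interior node receives the harmonic mean 1.5 rather than the arithmetic mean 2, which shows how the averaging favours the softer material.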
\begin{equation} \begin{array}{llll} {\left\langle {\bf \mu } \right\rangle _{l,l}^{xy} = \left( {\frac{1}{{2\hat h_{z,k} }}\sum\limits_{q = 0,1} {\frac{{h_{z,k + q} }}{{\mu _{i - 1/2,j - 1/2,k - 1/2 + q} }}} } \right)^{ - 1} ,\;\left\langle {\bf \mu } \right\rangle _{l,l}^{yz} = \left( {\frac{1}{{2\hat h_{x,i} }}\sum\limits_{q = 0,1} {\frac{{h_{x,i + q} }}{{\mu _{i - 1/2 + q,j - 1/2,k - 1/2} }}} } \right)^{ - 1} ,} \\ {\left\langle {\bf \mu } \right\rangle _{l,l}^{xz} = \left( {\frac{1}{{2\hat h_{y,j} }}\sum\limits_{q = 0,1} {\frac{{h_{y,j + q} }}{{\mu _{i - 1/2,j - 1/2 + q,k - 1/2} }}} } \right)^{ - 1} ,} \\ \end{array} \end{equation}
(A5)
\begin{equation} \begin{array}{llll} {\left\langle {\bf b} \right\rangle _{ll}^x = \frac{{4\hat h_{y,j} \hat h_{z,k} }}{{\sum\limits_{\scriptstyle p = 0,1 \atop \scriptstyle q = 0,1 } {\rho _{i - 1/2,j - 1/2 + p,k - 1/2 + q} h_{y,j + p} h_{z,k + q} } }},\;\left\langle {\bf b} \right\rangle _{ll}^y = \frac{{4\hat h_{x,i} \hat h_{z,k} }}{{\sum\limits_{\scriptstyle p = 0,1 \atop \scriptstyle q = 0,1 } {\rho _{i - 1/2 + p,j - 1/2,k - 1/2 + q} h_{x,i + p} h_{z,k + q} } }},} \\ {\left\langle {\bf b} \right\rangle _{ll}^z = \frac{{4\hat h_{x,i} \hat h_{y,j} }}{{\sum\limits_{\scriptstyle p = 0,1 \atop \scriptstyle q = 0,1 } {\rho _{i - 1/2 + p,j - 1/2 + q,k - 1/2} h_{x,i + p} h_{y,j + q} } }}.} \\ \end{array} \end{equation}
(A6)
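As an illustration of eq. (A6), the following NumPy sketch evaluates the $x$ component of the averaged buoyancy, i.e. the reciprocal of the area-weighted arithmetic mean of density over the four cells adjacent in the $y$–$z$ plane. The function name `buoyancy_x` and the array layout are hypothetical, and the sketch assumes, as before, that half-integer points are cell midpoints so that $4\hat h_{y,j}\hat h_{z,k}$ equals the sum of the four face areas. The $y$ and $z$ components follow by permuting the axes.

```python
import numpy as np


def buoyancy_x(rho_cells, hy, hz):
    """Averaged buoyancy <b>^x from cell-centred density.

    rho_cells : array (nx, ny, nz) of density per cell.
    hy, hz    : cell widths along y and z (lengths ny, nz).
    Returns an array (nx, ny-1, nz-1): for every x index, the reciprocal
    of the area-weighted mean density of the four cells meeting at each
    interior (j, k) position.
    """
    rho = np.asarray(rho_cells, dtype=float)
    hy = np.asarray(hy, dtype=float)
    hz = np.asarray(hz, dtype=float)
    nx, ny, nz = rho.shape

    area = hy[:, None] * hz[None, :]   # face areas h_y * h_z, shape (ny, nz)
    mass = rho * area[None, :, :]      # rho * h_y * h_z

    area_sum = np.zeros((ny - 1, nz - 1))      # equals 4 * hy^ * hz^
    mass_sum = np.zeros((nx, ny - 1, nz - 1))  # denominator of eq. (A6)
    for p in (0, 1):
        for q in (0, 1):
            area_sum += area[p:ny - 1 + p, q:nz - 1 + q]
            mass_sum += mass[:, p:ny - 1 + p, q:nz - 1 + q]
    return area_sum[None, :, :] / mass_sum
```

Unlike the harmonic averaging used for the elastic moduli, density is averaged arithmetically before inversion, so four equal-area cells with densities 1000 and 3000 kg m−3 in alternation give a buoyancy of 1/2000 m3 kg−1.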