Block-matching algorithm based on differential evolution for motion estimation

https://doi.org/10.1016/j.engappai.2012.08.003Get rights and content

Abstract

Motion estimation is one of the major problems in developing video coding applications. Among all motion estimation approaches, block-matching (BM) algorithms are the most popular methods due to their effectiveness and simplicity for both software and hardware implementations. A BM approach assumes that the movement of pixels within a defined region of the current frame (macro block, MB) can be modeled as a translation of pixels contained in the previous frame. In this procedure, the motion vector is obtained by minimizing the sum of absolute differences (SAD) produced by the MB of the current frame over a determined search window from the previous frame. The SAD evaluation is computationally expensive and represents the most consuming operation in the BM process. The most straightforward BM method is the full search algorithm (FSA), which finds the most accurate motion vector, exhaustively calculating the SAD values for all the elements of the search window. Over this decade, several fast BM algorithms have been proposed to reduce the number of SAD operations by calculating only a fixed subset of search locations at the cost of poor accuracy. In this paper, a new algorithm based on differential evolution (DE) is proposed to reduce the number of search locations in the BM process. To avoid computing several search locations, the algorithm estimates the SAD values (fitness) for some locations using the SAD values of previously calculated neighboring positions. As the proposed algorithm does not consider any fixed search pattern or any other different assumption, a high probability for finding the true minimum (accurate motion vector) is expected. In comparison with other fast BM algorithms, the proposed method deploys more accurate motion vectors, yet delivering competitive time rates.

Introduction

Virtually, all applications of video and visual communication deal with an enormous amount of data. The limited storage capacity and transmission bandwidth available has made digital video coding an important technology. In video coding, the high correlation between successive frames can be exploited to improve coding efficiency, which is usually achieved by using motion estimation (ME). Many ME methods have been studied in an effort to reduce the complexity of video coding, such as block-matching (BM) algorithms, parametric-based models (Dimitrios Tzovaras and Michael, 1999), optical flow (Barron et al., 1994), and pel-recursive techniques (Skowronski, 1999). Among these methods, BM seems to be the most popular method due to its effectiveness and simplicity for both software and hardware implementations. BM is also widely adopted by various video coding standards, such as MPEG1 (1993), MPEG2 (1994), MPEG4 (2000), H261 (1993), and I.-T.R., H.263 (2000).

In a BM algorithm, the current frame is divided into non-overlapping macro blocks (MB) of N×N pixel dimension. For each block, in the current frame, the best-matched block within a search window of size (2W+1)×(2W+1) in the previous frame is determined, where W is the maximum allowed displacement. The position difference between a template block in the current frame and the best-matched block in the previous frame is called the motion vector. A commonly used matching measure is the sum of absolute differences (SAD), which is computationally expensive and represents the most consuming operation in the BM process.

The full search algorithm (FSA) (Jain and Jain, 1981) is the simplest BM algorithm that can deliver the optimal estimation solution regarding the minimal matching error, because it checks all candidates one at a time. However, such exhaustive search and full-matching error calculation at each checking point yields an extremely computational-expensive FSA method that seriously constraints real-time video applications.

To decrease the computational complexity of the BM process, several BM algorithms have been proposed, which are based on the following three techniques: (1) Using a fixed pattern, which means that the search operation is conducted on a fixed subset of the total search window, and some famous examples include, the three step search (TSS) (Jong et al., 1994), the new three step search (NTSS) (Renxiang Li et al., 1994), the simple and efficient TSS (SES) (Jianhua Lu and Liou, 1997), the four step search (4SS) (Lai-Man and Wing-Chung, 1996), and the diamond search (DS) (Shan and Kai-Kuang, 2000). Such approaches have been algorithmically considered as the fastest. However, they are eventually not able to match the dynamic motion-content delivering false motion vectors (image distortions). (2) Reducing the search points signifies that the algorithm chooses only such locations as search points, which iteratively minimize the error-function (SAD values). In this category, some examples include the adaptive rood pattern search (ARPS) (Yao Nie and Ma, 2002), the fast block matching using prediction (FBMAUPR) (Yi-Ching et al., 2009), the block-based gradient descent search (BBGD) (Liu and Feig, 1996), and the neighbourhood elimination algorithm (NE) (Saha et al., 2011). These approaches assume that the error-function behaves monotonically, which holds well for slow-moving sequences; however, such properties do not hold true for other kind of movements in video sequences (Chow and Liou, 1993), yielding that the algorithms may get trapped into local minima. (3) Decreasing the computational overhead for each search point means that the matching cost (SAD operation) is replaced by a partial or simplified version with less complexity. Some examples of this category are new pixel-decimation (ND) (Saha et al., 2008), the efficient block matching using multilevel intra and inter-sub-blocks (Renxiang Li et al., 1994), and the successive elimination algorithm (Song et al., 2007). These techniques are based on the assumption that all pixels within each block move by the same amount, while a good estimate of the motion could be obtained by using only a fraction of the pixels. However, as only a fraction of the pixels enters into the matching computation, the use of these regular sub-sampling techniques can seriously affect the accuracy of the detection of motion vectors due to noise or illumination changes.

Alternatively, evolutionary approaches, such as genetic algorithms (GA) (Holland, 1975) and particle swarm optimization (PSO) (Kennedy and Eberhart, 1995) are well known for locating potential global optimum within an arbitrary search space. However, only a few evolutionary approaches have specifically addressed the problem of BM, such as the light-weight genetic block matching (LWG) (Chun-Hung and Ja-Ling, 1998), the genetic four-step search (GFSS) (Wu and So, 2003), and the PSO-BM (Yuan and Shen, 2008). Although these methods support an accurate identification of the motion vector, their spending times are very long, when compared with other BM techniques.

Differential evolution (DE), introduced by Storn and Price (1995), is a novel evolutionary algorithm, which is used to optimize complex continuous nonlinear functions. As a population-based algorithm, DE uses simple mutation and crossover operators to generate new candidate solutions, and applies one-to-one competition scheme to greedily decide whether the new candidate or its parent will survive in the next generation. Owing to its simplicity, ease of implementation, fast convergence, and robustness, the DE algorithm has gained much attention, reporting a wide range of successful applications in the literature (Babu and Munawar, 2007, Mayer et al., 2005, Kannan et al., 2003, Chiou et al., 2005, Chiou et al., 2004, Ursem and Vadstrup, 2003, Babu et al., 2003, Zelinka et al., 2008, Cuevas et al., 2010).

For many real-world applications, the number of calls to the objective function needs to be limited, e.g., because an evaluation is very time consuming or expensive, or the approach requires user interaction. However, DE does not seem to have such problems, because it usually requires many evaluations before producing a satisfying result.

The problem of excessively long fitness function calculations has already been faced in the field of evolutionary algorithms (EA), which is known as evolution control (Jin, 2005). In an evolution control approach, the idea is to replace the costly objective function evaluation for some individuals by fitness estimates, based on an approximate model of the fitness landscape. The individuals to be evaluated, and those to be estimated, are determined based on some fixed criteria, which depend on the specific properties of the used approximate model (Yaochu, 2011). The models, used in the estimation, can be built during the actual EA run, because EA repeatedly sample the search space at different points (Branke and Schmidt, 2005). There are certainly many possible approximation models, and several have already been used in combination with EA (e.g., polynomials (Zhou et al., 2005), the Kriging model (Ratle, 2001), the feedforward neural networks, including multi-layer perceptrons (Lim et al., 2010), and radial basis-function networks (Ong et al., 2008)). These models can be either global, which make use of all available data, or local, which make use of only a small set of data around the point where the function is to be approximated. Local models, however, have a number of advantages (Branke and Schmidt, 2005): they are well-known and established techniques, relatively fast, and take into account the intuitively most important information, the closest neighbors.

In this work, a new algorithm based on DE is proposed to reduce the number of search locations in the BM process. The algorithm uses a simple fitness calculation approach, which is based on the nearest neighbor interpolation (NNI) algorithm to estimate the fitness value (SAD operation) for several candidate solutions (search locations). As a result, the approach can substantially reduce the number of function evaluations, preserving the good search capabilities of DE. In comparison with other fast BM algorithms, the proposed method deploys more accurate motion vectors, yet delivering competitive time rates.

The paper is organized as follows: Section 2 holds a brief description about the differential evolution algorithm. In Section 3, the fitness calculation strategy for solving the expensive optimization problem is presented. Section 4 provides the background about the BM motion estimation issue, while Section 5 exposes the final BM algorithm as a combination of DE and the NNI estimator. Section 6 demonstrates the experimental results for the proposed approach over standard test sequences, and some conclusions are discussed in Section 7.

Section snippets

DE algorithm

The DE algorithm is a simple and direct search algorithm, which is based on population and aims for optimizing global multi-modal functions. DE employs the mutation operator to provide the exchange of information among several solutions.

There are various mutation-based generators to define the algorithm type. The version of DE algorithm used in this study is known as DE/best/l/exp or “DE1” (Storn and Price, 1995). DE algorithms begin by initializing a population of Np and D-dimensional vectors,

Fitness approximation method

EA based on fitness approximation aim to find the global minimum of a given function, considering only a very few number of function evaluations. To apply such approach, it is necessary that the objective function portraits the following conditions (Luoa et al., 2011): (1) it must be very costly to evaluate and (2) it must have few dimensions (up to five). Recently, several fitness estimators have been reported in the literature (Zhou et al., 2005, Ratle, 2001, Lim et al., 2010, Ong et al., 2008

Motion estimation and BM

For motion estimation, in a BM algorithm, the current frame of an image sequence It is divided into non-overlapping blocks of N×N pixels. For each template block in the current frame, the best-matched block within a search window of size (2W+1)×(2W+1) in the previous frame It1 is determined, where W is the maximum allowed displacement. The position difference between a template block in the current frame and the best-matched block in the previous frame is called the motion vector (MV) (see

BM algorithm based on DE with the estimation strategy

FSA finds the global minimum (the accurate MV), considering all locations within the search space S. Nevertheless, this approach has a high computational cost for practical use. To overcome such a problem, many fast algorithms have been developed, despite the fact that their precision is poorer than the FSA. A better BM algorithm should spend less computation time on searching and obtaining accurate MVs.

The BM algorithm, proposed in this study, has the velocity of the fastest algorithms and a

Experimental results

This section presents the results of the comparison of the proposed DE-BM algorithm with other existing BM algorithms. The simulations have been performed over the luminance component of popular video sequences listed in Table 1. Such sequences consist of different degrees and types of motion, including QCIF (176×144), CIF (352×288), and SIF (352×240). The first four sequences are Container, Carphone, Foreman, and Akiyo in QCIF format. The next two sequences are Stefan in CIF format and Tennis

Conclusions

In this paper, a new algorithm based on DE has been proposed to reduce the number of search locations in the BM process. To save computational time, the approach combines the traditional DE with a fitness estimation strategy to decide which search locations (individuals) can be estimated or actually evaluated. As a result, the approach has substantially reduced the number of function evaluations, yet preserving the good search capabilities of DE.

The proposed fitness calculation strategy

References (47)

  • J. Chiou et al.

    Variable scaling hybrid differential evolution for solving network reconfiguration of distribution systems

    IEEE Trans. Power Syst.

    (2005)
  • K.H.K. Chow et al.

    Generic motion search algorithm for video compression

    IEEE Trans. Circuits Syst. Video Technol.

    (1993)
  • L. Chun-Hung et al.

    A lightweight genetic block-matching algorithm for video coding

    IEEE Trans. Circuits Syst. Video Technol.

    (1998)
  • Ioannis Kompatsiaris Dimitrios Tzovaras et al.

    3D object articulation and motion estimation in model-based stereoscopic videoconference image sequence analysis and coding

    Signal Process. Image Commun.

    (1999)
  • D.E. Goldberg

    Genetic Algorithms in Search, Optimization and Machine Learning

    (1989)
  • J.H. Holland

    Adaptation in Natural and Artificial Systems

    (1975)
  • H261, Video Codec for Audio visual Services at px64kbit/s, ITU-T SG15, ITU-TRec.H.261, second...
  • I.-T.R., H.263, Video Coding for Low Bit Rate Communication, ITU-T SG16, ITU-TRec.H.263, third ed.,...
  • J. Kennedy, R.C. Eberhart, Particle swarm optimization, in: Proceedings of the 1995 IEEE International Conference on...
  • Jianhua Lu et al.

    A simple and efficient search algorithm for block-matching motion estimation

    IEEE Trans. Circuits Syst. Video Technol.

    (1997)
  • J.R. Jain et al.

    Displacement measurement and its application in interframe image coding

    IEEE Trans. Commun.

    (1981)
  • Y. Jin

    Comprehensive survey of fitness approximation in evolutionary computation

    Soft Comput.

    (2005)
  • H.-M. Jong et al.

    Accuracy improvement and cost reduction of 3-step search block matching algorithm for video coding

    IEEE Trans. Circuits Syst. Video Technol.

    (1994)
  • Cited by (51)

    • GGWO: Gaze cues learning-based grey wolf optimizer and its applications for solving engineering problems

      2022, Journal of Computational Science
      Citation Excerpt :

      Some proposed improvements of the DE algorithm are JADE [38], SHADE [39], PaDE [40], and MTDE [41]. These algorithms have also been used to solve real-world optimization problems [42,43]. The physics-based algorithms are inspired by physical laws, including the light refraction law, molecular dynamics, gravitational force, and inertia force.

    • Nature inspired algorithms (NIA) for efficient video compression – A brief study

      2020, Engineering Science and Technology, an International Journal
    • A hybrid block-based motion estimation algorithm using JAYA for video coding techniques

      2019, Digital Signal Processing: A Review Journal
      Citation Excerpt :

      To validate the efficacy of the proposed approach for ME, exhaustive simulations are performed in MATLAB 2017a with Core-i7 processor, 16GB RAM, Windows 10 operating system, and under certain relevant test environment as discussed below. To appraise and compare the effectiveness of the proposed ME approach, various search algorithms, namely, FSA [5], TSS [6], 4SS [9], NTSS [7], BBGD [13], DS [10], NE [14], ND [16], LWG [35], GFSS [36], PSO-BM [38], ABC-BM [49], PATTERN-ABC(HEX / DIA / SQU) [50], PBPSO [39], HS-BM [51], DE-BM [37], Test Zone Search (TZS) [52], HGS [28], DGS [29], M12 [24], M14 [25], ASRFME [26], DS [53], HEXBS [54], HMDS [55], and (MPBHS/ MPDS) [56] are considered as the benchmarks. Experiments are carried out with some of the standard and widely used video sequences, namely, the Foreman, Carphone, Akiyo, Container, Football, Stefan, Johnny, Kristen-Sara, Crowd-Run, and Old-Town-Cross.

    • Agent-based game theoretic model for block motion estimation and its multicore implementation

      2018, Swarm and Evolutionary Computation
      Citation Excerpt :

      Recently, biologically-inspired optimization algorithms have gained a lot of attention and researches have been trying to apply them to ME. Many variants of particle swarm optimization (PSO) [8–11], artificial bee colony optimization (ABCO) [12], and differential evolution (DE) [13] were used for locating potential global optimum within an arbitrary search space. These algorithms are population-based and mimic the swarming behavior of flocks of birds and herds of fish adapting to their environment.

    View all citing articles on Scopus
    View full text