Fast reference frame selection based on content similarity for low complexity HEVC encoder

https://doi.org/10.1016/j.jvcir.2016.07.018

Highlights

  • A fast reference frame selection algorithm for the HEVC encoder is proposed.

  • A relationship between the content similarity and reference frame selection is derived.

  • The content similarity is studied without any extra computational complexity.

  • Experimental results show that the proposed algorithm efficiently reduces the encoding complexity of the best reference frame decision process.

Abstract

High efficiency video coding (HEVC) is the state-of-the-art video coding standard; it achieves about 50% bit rate saving compared with H.264/AVC while maintaining the same visual quality. This coding efficiency comes from a set of advanced coding tools, such as multiple reference frames (MRF) based interframe prediction, which effectively improves the coding efficiency of the HEVC encoder but also adds heavy computation to it. The high encoding complexity becomes a bottleneck that prevents high definition videos and the HEVC encoder from being widely used in real-time and low power multimedia applications. In this paper, we propose a content similarity based fast reference frame selection algorithm for reducing the computational complexity of the multiple reference frames based interframe prediction. Based on the large content similarity between the parent prediction unit (Inter_2N × 2N) and the children prediction units (Inter_2N × N, Inter_N × 2N, Inter_N × N, Inter_2N × nU, Inter_2N × nD, Inter_nL × 2N, and Inter_nR × 2N), the reference frame selection information of the children prediction units is obtained by learning from the results of their parent prediction unit. Experimental results show that the proposed algorithm can save about 54.29% and 43.46% of the MRF encoding time for the low-delay-main and random-access-main coding structures, respectively, while the rate distortion performance degradation is negligible.
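The parent-to-children reuse described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `RefChoice` type, the `cost` callback (a stand-in for motion estimation and RD cost evaluation), the `"L0"`/`"L1"` direction labels, and the rule that every child PU mode simply copies the parent's choice are all assumptions made for illustration, and bi-prediction is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

PARENT_MODE = "Inter_2Nx2N"
# Children PU modes listed in the abstract.
CHILD_MODES = ["Inter_2NxN", "Inter_Nx2N", "Inter_NxN",
               "Inter_2NxnU", "Inter_2NxnD", "Inter_nLx2N", "Inter_nRx2N"]


@dataclass(frozen=True)
class RefChoice:
    direction: str  # hypothetical label for the reference list: "L0" or "L1"
    ref_idx: int    # index of the reference frame inside that list


# Hypothetical callback: (PU mode, candidate) -> cost of using that candidate.
CostFn = Callable[[str, RefChoice], float]


def select_reference_frames(
    candidates: List[RefChoice], cost: CostFn
) -> Tuple[Dict[str, RefChoice], int]:
    """Search all candidates for the parent PU, then reuse its choice for children.

    Returns the per-mode decisions and the number of cost evaluations performed.
    """
    # Exhaustive search over every candidate, done only for the parent PU.
    parent_best = min(candidates, key=lambda c: cost(PARENT_MODE, c))
    decisions = {PARENT_MODE: parent_best}
    evaluations = len(candidates)

    # Each child PU mode evaluates only the candidate chosen by the parent.
    for mode in CHILD_MODES:
        cost(mode, parent_best)
        decisions[mode] = parent_best
        evaluations += 1

    return decisions, evaluations
```

As a rough illustration under these assumptions, with a hypothetical set of eight candidates (four reference frames in each of two lists) the sketch performs 8 + 7 = 15 cost evaluations, whereas an exhaustive search over all eight PU modes would perform 8 × 8 = 64.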

Introduction

With the development of capture and display technologies, full high definition (HD) and ultra HD videos are attracting more and more attention since they can provide higher perceptual/video quality. However, with the increased video resolution and frame rate, the data volume of raw HD videos increases dramatically. It is highly desirable to develop high compression techniques because memory and channel bandwidth are still limited. To meet this compression demand, the joint collaborative team on video coding (JCT-VC) of the ITU-T video coding experts group (VCEG) and the ISO/IEC moving picture experts group (MPEG) has developed a state-of-the-art video coding standard named high efficiency video coding (HEVC) [1], [2], [3]. The HEVC can achieve the same subjective visual quality as the H.264/AVC [4] high profile while requiring only about 50% of the bit rate. This coding efficiency comes from a set of advanced coding tools, such as flexible size unit representation, intraframe prediction with 35 modes, multiple reference frames (MRF) interframe prediction, new in-loop filtering methods, and so on. Meanwhile, the computational complexity of the HEVC encoder increases dramatically because of these coding tools. The high computational complexity becomes a bottleneck that prevents HD videos and the HEVC encoder from being widely used in real-time and low power multimedia applications, such as live video broadcasting, mobile video communication, and video surveillance. Thus, there is a pressing need to reduce the computational complexity of the HEVC encoder.

Recently, many researchers have devoted their efforts to reducing the computational complexity of the HEVC encoder [5], [6], [7], [8], [9], [10], [11]. Based on the Bayesian decision theory and rate distortion (RD) characteristics, Lee et al. proposed a fast coding unit (CU) size decision method for the HEVC [5]. In [6], Shen et al. proposed a CU depth decision method based on the depth selection correlation between the spatial-temporal neighboring CUs and the current CU. In addition, they proposed an early termination method for motion estimation based on the motion homogeneity, RD cost, and Skip mode. By using the CU depth selection information of spatially neighboring CUs, Kim et al. proposed a CU depth range decision method for the HEVC [7]. In [8], the RD-complexity characteristics of inter prediction were analyzed and an efficient inter mode decision method for the HEVC was derived. Based on the CU motion activity and the mode selection correlation among hierarchical depth CUs, Pan et al. proposed an early Merge mode decision method for the HEVC fast interframe prediction [9]. By utilizing the estimated optical flow of the downsampled frames, Xiong et al. proposed a fast inter CU selection method for the HEVC [10]. Based on the prediction mode and RD cost correlations among different quadtree depth levels and spatially neighboring CUs, Shen et al. proposed a fast CU size and intra mode decision method for the HEVC [11]. These methods mainly focus on reducing the computational complexity of the flexible size unit representation technique; the HEVC encoding complexity could be further reduced by optimizing the MRF based interframe prediction.

In the last decade, a number of methods have been proposed to reduce the encoding complexity of the MRF based interframe prediction for the H.264/AVC and its extensions [12], [13], [14], [15], [16], [17], [18]. By taking into account the correlation/continuity of motion vectors among different reference frames, Su et al. proposed a fast MRF based motion estimation for the H.264/AVC [12]. Based on the spatial and temporal correlation of the reference frame index and motion vectors, Jun et al. proposed an efficient priority-based MRF selection method for the H.264/AVC fast motion estimation [13]. In [14], Chen et al. proposed a fast MRF based motion estimation for the H.264/AVC that uses the stored motion vectors to compose the motion vector without searching all active reference frames. In [15], Liu et al. proposed a fast MRF selection method for the H.264/AVC motion estimation by using the motion activity and Hadamard coefficients. Based on the reference frame selection of the 16 × 16 mode partition, Zhang et al. proposed an efficient MRF selection method for the H.264 based multiview video coding [16]. In [17], Yeh et al. proposed a fast mode decision based MRF selection for the H.264 based multiview video coding system by using an inter-view rate distortion prediction method. In [18], by using a fast mode decision based on the inter-view and inter-component correlations, Lei et al. proposed a low complexity MRF decision method for H.264 based multiview depth video coding. These methods can efficiently reduce the computational complexity; however, they were proposed for the H.264 and are not suitable for direct application to the HEVC encoder because of the different statistical characteristics and coding tools of the HEVC encoding system. In [19], a fast reference frame selection was proposed according to the motion complexity, which is computed from the distribution of the best reference frame, the motion vector difference, and its associated average distortion. However, the MRF encoding time saving of that method is still limited and unstable for the HEVC with the random-access-main coding structure because of the use of the IBP prediction structure.

In this paper, we propose a fast MRF selection algorithm for the fast HEVC interframe prediction, which is based on the relationship between the content similarity and the reference frame selection. The rest of this paper is organized as follows. A review of the HEVC MRF encoding process is presented in Section 2. The details of the proposed fast MRF selection algorithm are given in Section 3. Section 4 shows the experimental results. An algorithm discussion is given in Section 5. Finally, Section 6 concludes this paper.

Section snippets

Review on the HEVC MRF encoding process

Like previous video coding standards such as H.264/AVC, the HEVC standard is also a hybrid video coding scheme. In the HEVC encoding process, each frame is partitioned into a sequence of coding tree units (CTUs). A CTU is the basic unit of coding and, for 4:2:0 color sampling, consists of a luma coding tree block (CTB), two chroma CTBs, and the associated syntax elements. According to the quadtree syntax, the CTU is further split into one or multiple CUs. Then, based on the prediction-type, the CU can be

Encoding complexity analysis on the MRF encoding process

In order to analyze the encoding complexity of the MRF selection process, eight HEVC test sequences (BQSquare, BasketballPass, BQMall, BasketballDrill, FourPeople, Johnny, Cactus, and ParkScene) with various resolutions and motion activities are encoded by the HM12.0 [21] under the HEVC common test conditions [22]. Four quantization parameters (QPs) (22, 27, 32, and 37) are used. The motion estimation search range and method are 64 and TZSearch, respectively. The low-delay-main and

Experimental results

To evaluate the coding performance of the proposed fast MRF selection algorithm, the HEVC reference software HM12.0 is used as the software platform. The hardware platform is an Intel Xeon CPU E5-1620 v2 @ 3.70 GHz with 16.0 GB RAM, running the Microsoft Windows 7 64-bit operating system. To compare the coding performance in terms of BDPSNR, BDBR, total encoding time saving (TS for short), and total reference frame encoding time saving (RTS for short), three 416 × 240 sequences with Class D (BQSquare,

Algorithm discussion

It is well known that, when designing fast algorithms for video coding, the decision accuracy of the proposed algorithm is highly correlated with the coding efficiency. In other words, if the decision accuracy is high and close to 100%, there would be no RD performance degradation; on the contrary, if the decision accuracy is low, the RD performance would be degraded. From Table 2, we can see that the average decision accuracy of the proposed algorithm is 63.26% and 67.74% for the

Conclusion

The MRF encoding process consumes about 70% of the total encoding time of an HEVC encoder. To reduce the computational complexity of the MRF encoding process, an early reference frame decision algorithm is proposed in this paper. Since there is high video content similarity between the parent PU and the children PUs, the reference frame information of the children PUs, including the reference frame index and the reference frame direction, is set according to that of the parent PU. Experimental results show that the

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant 61501246, Grant 61271324, Grant 61471348, Grant 61232016, in part by the Natural Science Foundation of Jiangsu Province of China under Grant BK20150930, in part by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China under Grant 15KJB510019, in part by the Natural Science Foundation of Hebei Province of China under Grant F2015202311, in part by the Project through the

References (28)

  • H. Wang et al.

    Early detection of all-zero 4 × 4 blocks in high efficiency video coding

    J. Visual Commun. Image Represent.

    (2014)
  • J. Chen et al.

    Parallel fast inter mode decision for H.264/AVC encoding

    J. Visual Commun. Image Represent.

    (2013)
  • P. Hanhart et al.

    Calculation of average coding efficiency based on subjective quality scores

    J. Visual Commun. Image Represent.

    (2014)
  • Information technology - High efficiency coding and media delivery in heterogeneous environments - Part 2: High...
  • G.J. Sullivan et al.

    Overview of the high efficiency video coding (HEVC) standard

    IEEE Trans. Circ. Syst. Video Technol.

    (2012)
  • J. Lee et al.

    A fast CU size decision algorithm for HEVC

    IEEE Trans. Circ. Syst. Video Technol.

    (2014)
  • L. Shen et al.

    An effective CU size decision method for HEVC encoders

    IEEE Trans. Multimedia

    (2013)
  • D.-H. Kim et al.

    Selective CU depth range decision algorithm for HEVC encoder

  • J. Vanne et al.

    Efficient mode decision schemes for HEVC inter prediction

    IEEE Trans. Circ. Syst. Video Technol.

    (2014)
  • Z. Pan et al.

    Early MERGE mode decision based on motion estimation and hierarchical depth correlation for HEVC

    IEEE Trans. Broadcasting

    (2014)
  • J. Xiong et al.

    A fast HEVC inter CU selection method based on pyramid motion divergence

    IEEE Trans. Multimedia

    (2014)
  • L. Shen et al.

    Fast CU size decision and mode decision algorithm for HEVC intra coding

    IEEE Trans. Consumer Electron.

    (2013)
  • Y. Su et al.

    Fast multiple reference frame motion estimation for H.264/AVC

    IEEE Trans. Circ. Syst. Video Technol.

    (2006)
  • D. Jun et al.

    An efficient priority-based reference frame selection method for fast motion estimation in H.264/AVC

    IEEE Trans. Circ. Syst. Video Technol.

    (2010)

This paper has been recommended for acceptance by Zicheng Liu.
