Journal of Visual Communication and Image Representation
Fast reference frame selection based on content similarity for low complexity HEVC encoder☆
Introduction
With the development of capture and display technologies, full high definition (HD) and ultra HD videos are attracting increasing attention because they provide higher perceptual quality. However, as video resolution and frame rate increase, the data volume of raw HD video grows dramatically. Efficient compression techniques are highly desirable because memory and channel bandwidth remain limited. To meet this demand, the joint collaborative team on video coding (JCT-VC) of the ITU-T video coding experts group (VCEG) and the ISO/IEC moving picture experts group (MPEG) has developed a state-of-the-art video coding standard named high efficiency video coding (HEVC) [1], [2], [3]. HEVC can achieve the same subjective visual quality as the H.264/AVC [4] high profile while requiring only about 50% of the bit rate. This coding efficiency comes from a set of advanced coding tools, such as flexible size unit representation, intraframe prediction with 35 modes, multiple reference frames (MRF) interframe prediction, and new in-loop filtering methods. Meanwhile, these coding tools dramatically increase the computational complexity of the HEVC encoder. The high computational complexity is a bottleneck for the wide use of HD video and the HEVC encoder in real-time and low-power multimedia applications, such as live video broadcasting, mobile video communication, and video surveillance. Thus, there is a pressing need to reduce the computational complexity of the HEVC encoder.
Recently, many researchers have devoted their efforts to reducing the computational complexity of the HEVC encoder [5], [6], [7], [8], [9], [10], [11]. Based on Bayesian decision theory and rate distortion (RD) characteristics, Lee et al. proposed a fast coding unit (CU) size decision method for the HEVC [5]. In [6], Shen et al. proposed a CU depth decision method based on the depth selection correlation between the spatial-temporal neighboring CUs and the current CU. They also proposed an early termination of the motion estimation based on the motion homogeneity, RD cost, and Skip mode. Using the CU depth selection information of spatially neighboring CUs, Kim et al. proposed a CU depth range decision method for the HEVC [7]. In [8], the RD-complexity characteristics of inter prediction were analyzed, and an efficient inter mode decision method for the HEVC was derived. Based on the CU motion activity and the mode selection correlation among hierarchical depth CUs, Pan et al. proposed an early Merge mode decision method for the HEVC fast interframe prediction [9]. Using the estimated optical flow of the downsampled frames, Xiong et al. proposed a fast inter CU selection method for the HEVC [10]. Based on the prediction mode and RD cost correlations among different quadtree depth levels and spatially neighboring CUs, Shen et al. proposed a fast CU size and intra mode decision method for the HEVC [11]. These methods mainly focus on reducing the computational complexity of the flexible size unit representation technique; the HEVC encoding complexity could be further reduced by optimizing the MRF based interframe prediction.
In the last decade, a number of methods have been proposed to reduce the encoding complexity of the MRF based interframe prediction for the H.264/AVC and its extensions [12], [13], [14], [15], [16], [17], [18]. By taking into account the correlation/continuity of motion vectors among different reference frames, Su et al. proposed a fast MRF based motion estimation for the H.264/AVC [12]. Based on the spatial and temporal correlation of the reference frame index and motion vectors, Jun et al. proposed an efficient priority-based MRF selection method for the H.264/AVC fast motion estimation [13]. In [14], Chen et al. proposed a fast MRF based motion estimation for the H.264/AVC that uses the stored motion vectors to compose the motion vector without searching all active reference frames. In [15], Liu et al. proposed a fast MRF selection method for the H.264/AVC motion estimation by using the motion activity and Hadamard coefficients. Based on the reference frame selection of the 16 × 16 mode partition, Zhang et al. proposed an efficient MRF selection method for H.264 based multiview video coding [16]. In [17], Yeh et al. proposed a fast mode decision based MRF selection for an H.264 based multiview video coding system by using an inter-view rate distortion prediction method. In [18], by using a fast mode decision based on inter-view and inter-component correlations, Lei et al. proposed a low complexity MRF decision method for H.264 based multiview depth video coding. These methods can efficiently reduce the computational complexity. However, they were proposed for H.264 and are not suitable for direct application to the HEVC encoder, owing to the different statistical characteristics and coding tools of the HEVC encoding system. In [19], a fast reference frame selection method was proposed according to the motion complexity, which is computed from the distribution of the best reference frame, the motion vector difference, and its associated average distortion. However, the MRF encoding time saving of that method is still limited and unstable for the HEVC with the random-access-main coding structure because of the use of the IBP prediction structure.
In this paper, we propose a fast MRF selection algorithm for HEVC interframe prediction, which is based on the relationship between the content similarity and the reference frame selection. The rest of this paper is organized as follows. Section 2 reviews the HEVC MRF encoding process. Section 3 presents the details of the proposed fast MRF selection algorithm. Section 4 shows the experimental results. Section 5 discusses the algorithm. Finally, Section 6 concludes this paper.
Section snippets
Review on the HEVC MRF encoding process
Like previous video coding standards such as H.264/AVC, the HEVC standard also follows the hybrid video coding approach. In the HEVC encoding process, each frame is partitioned into a sequence of coding tree units (CTUs). The CTU is the basic unit of coding and, for 4:2:0 color sampling, consists of a luma coding tree block (CTB), two chroma CTBs, and the associated syntax elements. According to the quadtree syntax, the CTU is further split into one or multiple CUs. Then, based on the prediction-type, the CU can be
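The quadtree split of a CTU into CUs mentioned above can be illustrated with a minimal sketch. The function name `split_ctu`, the `should_split` callback, and the 64 × 64 CTU and 8 × 8 minimum CU sizes used in the example are assumptions chosen for illustration; a real encoder would take the split decision from an RD cost comparison rather than a fixed rule.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int]  # (x, y, size) of a square block in luma samples


def split_ctu(x: int, y: int, size: int, min_cu: int,
              should_split: Callable[[int, int, int], bool]) -> List[Block]:
    """Return the leaf CUs of one CTU under a quadtree split rule.

    should_split is a hypothetical stand-in for the encoder's decision.
    """
    if size > min_cu and should_split(x, y, size):
        half = size // 2
        leaves: List[Block] = []
        # Visit the four quadrants in raster order (top row first).
        for dy in (0, half):
            for dx in (0, half):
                leaves += split_ctu(x + dx, y + dy, half, min_cu, should_split)
        return leaves
    return [(x, y, size)]


# Example rule (assumed): split the 64x64 CTU, then only its top-left 32x32 CU.
def example_rule(x: int, y: int, size: int) -> bool:
    return size == 64 or (size == 32 and x == 0 and y == 0)


cus = split_ctu(0, 0, 64, 8, example_rule)
```

With this rule the CTU yields four 16 × 16 CUs in the top-left quadrant and three 32 × 32 CUs elsewhere, and the leaf areas always sum to the CTU area.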
Encoding complexity analysis on the MRF encoding process
In order to analyze the encoding complexity of the MRF selection process, eight HEVC test sequences (BQSquare, BasketballPass, BQMall, BasketballDrill, FourPeople, Johnny, Cactus, and ParkScene) with various resolutions and motion activities are encoded by the HM12.0 [21] under the HEVC common test conditions [22]. Four quantization parameters (QPs) (22, 27, 32, and 37) are used. The motion estimation search range and method are 64 and TZSearch, respectively. The low-delay-main and
Experimental results
To evaluate the coding performance of the proposed fast MRF selection algorithm, the HEVC reference software HM12.0 is used as the software platform. The hardware platform is Intel Xeon CPU E5-1620 v2 @ 3.70 GHz, 16.0 GB RAM with the Microsoft Windows 7 64-bit operating system. To compare the coding performance in terms of BDPSNR, BDBR, total encoding time saving (TS for short), total reference frame encoding time saving (RTS for short), three 416 × 240 sequences with Class D (BQSquare,
Algorithm discussion
It is well known that, when designing fast algorithms for video coding, the decision accuracy of the algorithm is highly correlated with the coding efficiency. In other words, if the decision accuracy is high and close to 100%, there would be no RD performance degradation; conversely, if the decision accuracy is low, the RD performance would be degraded. From Table 2, we can see that the average decision accuracy of the proposed algorithm is 63.26% and 67.74% for the
Conclusion
The MRF encoding process consumes about 70% of the total encoding time of an HEVC encoder. To reduce the computational complexity of the MRF encoding process, an early reference frame decision algorithm is proposed in this paper. Since there is high video content similarity between the parent PU and its children PUs, the reference frame information of the children PUs, including the reference frame index and the reference frame direction, is set according to that of the parent PU. Experimental results show that the
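The parent-to-child inheritance of reference frame information described in this section can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the names `RefInfo`, `full_candidates`, and `child_candidates`, the direction labels, and the fallback to an exhaustive search when no parent decision exists are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical labels for the reference frame direction.
LIST0, LIST1, BI = "L0", "L1", "BI"


@dataclass(frozen=True)
class RefInfo:
    """Reference frame decision of a PU: direction plus index per list."""
    direction: str                # LIST0, LIST1, or BI
    idx_l0: Optional[int] = None  # reference index in list 0, if used
    idx_l1: Optional[int] = None  # reference index in list 1, if used


def full_candidates(num_l0: int, num_l1: int) -> List[Tuple[str, int]]:
    """Every (list, index) pair an exhaustive MRF search would test."""
    return ([(LIST0, i) for i in range(num_l0)]
            + [(LIST1, i) for i in range(num_l1)])


def child_candidates(parent: Optional[RefInfo],
                     num_l0: int, num_l1: int) -> List[Tuple[str, int]]:
    """Restrict a child PU's search to the parent's reference frames.

    Assumption: fall back to the exhaustive set when the parent has no
    usable decision (for example, the parent was intra coded).
    """
    if parent is None:
        return full_candidates(num_l0, num_l1)
    cands: List[Tuple[str, int]] = []
    if parent.direction in (LIST0, BI) and parent.idx_l0 is not None:
        cands.append((LIST0, parent.idx_l0))
    if parent.direction in (LIST1, BI) and parent.idx_l1 is not None:
        cands.append((LIST1, parent.idx_l1))
    return cands or full_candidates(num_l0, num_l1)
```

For a bi-predicted parent with four frames per list, the child tests two reference frames instead of eight, which is where the time saving would come from in such a scheme.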
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China under Grant 61501246, Grant 61271324, Grant 61471348, Grant 61232016, in part by the Natural Science Foundation of Jiangsu Province of China under Grant BK20150930, in part by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China under Grant 15KJB510019, in part by the Natural Science Foundation of Hebei Province of China under Grant F2015202311, in part by the Project through the
References (28)
- et al. Early detection of all-zero 4 × 4 blocks in high efficiency video coding. J. Visual Commun. Image Represent. (2014)
- et al. Parallel fast inter mode decision for H.264/AVC encoding. J. Visual Commun. Image Represent. (2013)
- et al. Calculation of average coding efficiency based on subjective quality scores. J. Visual Commun. Image Represent. (2014)
- Information technology - High efficiency coding and media delivery in heterogeneous environments - Part 2: High...
- et al. Overview of the high efficiency video coding (HEVC) standard. IEEE Trans. Circ. Syst. Video Technol. (2012)
- et al. A fast CU size decision algorithm for HEVC. IEEE Trans. Circ. Syst. Video Technol. (2014)
- et al. An effective CU size decision method for HEVC encoders. IEEE Trans. Multimedia (2013)
- et al. Selective CU depth range decision algorithm for HEVC encoder.
- et al. Efficient mode decision schemes for HEVC inter prediction. IEEE Trans. Circ. Syst. Video Technol. (2014)
- et al. Early MERGE mode decision based on motion estimation and hierarchical depth correlation for HEVC. IEEE Trans. Broadcasting (2014)
- A fast HEVC inter CU selection method based on pyramid motion divergence. IEEE Trans. Multimedia
- Fast CU size decision and mode decision algorithm for HEVC intra coding. IEEE Trans. Consumer Electron.
- Fast multiple reference frame motion estimation for H.264/AVC. IEEE Trans. Circ. Syst. Video Technol.
- An efficient priority-based reference frame selection method for fast motion estimation in H.264/AVC. IEEE Trans. Circ. Syst. Video Technol.
☆ This paper has been recommended for acceptance by Zicheng Liu.