Published: 2018-02-16
Image Processing and Coding
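Received: 2017-06-12; Revised: 2017-10-16
Supported by: Key Research and Development Program of Jiangsu Province, Industry Foresight and Common Key Technology (BE2016009-3)
First author:
Zhong Weibo (born 1975), male, associate professor. He received a Ph.D. in Earth Exploration and Information Technology from China University of Mining and Technology (Beijing) in 2005. His main research interests are signal and information processing, and pattern recognition and intelligent systems. E-mail: vebopost@sohu.com
CLC number: TP391
Document code: A
Article ID: 1006-8961(2018)02-0155-08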
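Abstract (translated from the Chinese)
Objective The aim is to improve the coding efficiency of high efficiency video coding (HEVC) so that it can meet the demands of real-time coding and transmission of high-resolution, high-frame-rate video. Analysis shows that the partitioning of intra coding units (CUs) has a decisive effect on HEVC coding efficiency, so speeding up CU partitioning can greatly improve the real-time performance of HEVC encoding. Method Analysis of video data shows that it has strong temporal and spatial correlation, and that the results of intra CU partitioning are likewise strongly correlated in time and space. The partitioning results of the previous frame and of the already coded part of the current frame can therefore be used to predict the partition and speed up intra CU partitioning. On this basis, a fast intra CU partitioning algorithm is given. It first determines a preliminary shape of the coding tree unit (CTU) of the current block from the temporal correlation between adjacent frames and the spatial correlation within the frame, and then uses the mean depth of the co-located CTU in the previous frame, the depths of the already coded CTUs in the current frame, and the corresponding rate-distortion costs to decide the final CTU shape. A refresh frame is inserted at a specified frame interval and partitioned with the standard CU partitioning of HM16.7, which prevents the errors of the fast algorithm from accumulating. Result The algorithm was tested on videos with different resolutions and frame rates. Compared with the HEVC reference model HM16.7, it saves about 40% of the coding time on average, with essentially unchanged coding quality and a slight increase in bit rate; the bit-rate increase for high-resolution, high-frame-rate videos is generally smaller than that for low-resolution, low-frame-rate videos. Conclusion Within the HEVC framework, the algorithm exploits the temporal and spatial correlation of video data to optimize intra CU partitioning, and it plays an important role in improving the real-time performance of HEVC encoding, especially for high-resolution, high-frame-rate video.
Keywords
intra coding unit; fast partitioning; spatial-temporal correlation; high efficiency video coding (HEVC)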
Abstract
Objective The bit rate of high efficiency video coding (HEVC) can be reduced by approximately 50% compared with H.264/AVC at nearly the same video quality. However, the coding complexity of HEVC increases several-fold at the same time. In particular, high-resolution and high-frame-rate videos require additional coding time. The coding time of HEVC must be reduced to satisfy the requirements of real-time coding and transmission for such videos. Statistics show that intra coding unit (CU) partitioning accounts for more than 99% of the total coding time in HEVC intra coding, so the efficiency of CU partitioning has a decisive impact on the efficiency of HEVC. The real-time performance of HEVC can be improved significantly by optimizing its CU partitioning method. Many approaches have been used for this purpose, such as narrowing the traversal range of the CU depth, reducing the number of rate-distortion cost calculations, and skipping the intra prediction of large CUs. Method Strong spatial and temporal correlation exists among consecutive frames of video data. In HEVC, a coding tree unit (CTU) is strongly correlated with the CTU at the same position in the adjacent frame and with the surrounding CTUs in the same frame. According to statistics, approximately 71.5% of co-located CTUs in consecutive frames have the same depth, and the correlation between consecutive frames is stronger in slowly changing video than in rapidly changing video. Therefore, the CTU of the current frame can be estimated from the CTUs of the previous frame. The rate-distortion cost ratio between co-located CTUs in consecutive frames mostly lies between 0.8 and 1.2. The statistics indicate that this ratio is near 1.0 for slowly changing videos and farther from 1.0 for rapidly changing videos.
The rate-distortion cost of the current CTU can be estimated from the rate-distortion cost of the co-located CTU in the previous frame, which can be used to accelerate CU partitioning. Based on these characteristics, a fast intra CU splitting algorithm is proposed in this paper. In this algorithm, the CTU partition is determined preliminarily from the co-located CTU in the previous frame and from the adjacent CTUs. The final partition is then determined from the average depth of the co-located CTU in the previous frame, the weighted average depth of the adjacent CTUs, the standard deviation of the CU luma, and the corresponding rate-distortion cost. All parameters used in the algorithm are obtained from actual video data. The proposed method can significantly reduce the intra CU splitting time. A refresh frame is inserted at a specified frame interval and coded with the standard CU partitioning of HM16.7, which prevents the errors introduced by the fast algorithm from accumulating. The proposed algorithm was implemented in C++ on top of HM16.7, the HEVC reference software. Result The proposed method was applied to videos of many different resolutions and frame rates to verify its feasibility and reliability. Experimental results show that, compared with HM16.7, the algorithm maintains video quality and saves approximately 40% of the encoding time, with an increase of only about 1.4% in bit rate, an increase of approximately 2.93% in BDBR (Bjøntegaard delta bit rate), and a decrease of approximately 0.17 dB in BD-PSNR (Bjøntegaard delta peak signal-to-noise ratio). The statistics indicate that the absolute values of BDBR and BD-PSNR tend to decrease as video resolution increases, and that the bit-rate increase for high-resolution, high-frame-rate videos is generally smaller than that for low-resolution, low-frame-rate videos.
Conclusion The experimental results show that the proposed algorithm, built on the HEVC reference software HM16.7, reduces video coding time by using the spatial and temporal correlation in video data to shorten intra CU splitting. The algorithm skips the rate-distortion calculation of the CU with zero depth and uses the similarity of the CTUs in consecutive frames to determine the CTU partition of the current frame in advance. The method is feasible and reliable, and it can improve the real-time performance of HEVC significantly, especially for high-resolution, high-frame-rate videos. It performs better in the all-intra configuration than in the low-delay and random-access configurations. The algorithm should be optimized further to reduce coding time and bit rate and to improve coding quality for low-resolution, low-frame-rate videos, and it should be validated under the other HEVC coding modes.
Key words
intra coding unit; fast segmentation; spatial-temporal correlation; high efficiency video coding
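0 Introduction
To meet the demands of compressing high-resolution, high-frame-rate video, the Joint Collaborative Team on Video Coding (JCT-VC) formally released the video coding standard HEVC (high efficiency video coding) in 2013. Compared with H.264/AVC, HEVC reduces the bit rate by about 50% at the same picture quality, but its coding complexity increases several-fold [1-3]. In HEVC intra coding, CU (coding unit) partitioning accounts for more than 99% of the total coding time [4], so improving the efficiency of intra CU partitioning is important for the real-time performance of HEVC encoding.
Many fast intra CU partitioning algorithms already exist. Reference [5] first uses the spatial correlation of the video sequence to determine the range of CU depths to traverse, and then uses the depth correlation of the surrounding coded CUs to decide whether to skip the rate-distortion cost calculation for the current CU; it reduces coding time by about 16% on average. Reference [6] applies Bayesian estimation to the rate-distortion costs collected for CUs at each depth in order to split or terminate CUs early; with the peak signal-to-noise ratio (PSNR) essentially unchanged, the bit rate rises slightly while a large amount of coding time is saved. Reference [7] terminates CU splitting early based on the statistics of the rate-distortion cost: thresholds for terminating the split at each depth are obtained experimentally, and if the rate-distortion cost of the CU at the current depth is below the threshold for that depth, the CU is not split further. Because this algorithm considers only the statistics of the rate-distortion cost, it works well only for particular video sequences. Reference [8] predicts the depth of the current CU from the depths of the surrounding coded CUs and classifies coding blocks into four types according to the predicted depth; it exploits only the spatial correlation of the sequence and reduces coding time by about 21.1% on average. Reference [9] uses the Sobel edge detection operator to compute edge-point thresholds for each depth in one frame, narrows the depth range traversed in the following frames, and then applies rate-distortion cost thresholds within the narrowed range to terminate CU splitting early; combining the Sobel-based depth range detection with rate-distortion cost statistics speeds up CU partitioning. Reference [10] combines texture properties with information from spatially neighboring CUs to skip the intra prediction of large CUs. Reference [11] determines the CU size from the luma variance of the CU and reduces coding time by 40%~70%. These fast partitioning algorithms mostly exploit the spatial correlation within a frame or the statistical properties of the video data, but they do not fully consider the temporal correlation between adjacent frames of the sequence.
This paper exploits the strong temporal correlation between adjacent frames of high-frame-rate video and the strong spatial correlation within frames of high-resolution video to give a fast intra coding block partitioning algorithm. First, the mean depth of the CTU (coding tree unit) in the previous frame and the weighted mean depth of the surrounding coded CTUs are used to decide whether to skip the rate-distortion cost calculation for the current CU at depth 0. If it is not skipped, the rate-distortion cost calculation starts from depth 0. Otherwise, processing depends on the co-located CTU of the previous frame. If the co-located CU has depth 1, the rate-distortion costs of the CUs at the same position in the two frames decide whether to split the CU starting from the previous frame's CTU partition. If the co-located CU has depth 2, the ratio of the luma standard deviations of the two co-located CUs decides whether to merge or split relative to the previous frame's CTU partition. If the co-located CU has depth 3, the rate-distortion costs of the current CU at depth 3 and of the CU obtained by merging upward are both computed to decide whether to merge relative to the previous frame's CTU partition. In addition, to prevent the errors introduced by this fast partitioning from accumulating and affecting subsequent frames, a refresh frame is inserted at a predetermined frame interval and partitioned with the standard algorithm. Experimental results show that, compared with the reference model HM16.7 [12], the proposed fast algorithm saves about 40% of the coding time with essentially unchanged video quality and an average bit-rate increase of about 1.4%.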
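1 Coding unit partitioning
In the HEVC reference model HM (HEVC test model), regions with flat texture use larger CUs and regions with complex texture use smaller CUs. HM computes the rate-distortion cost of every CU combination in turn and selects the combination with the smallest rate-distortion cost as the final partition. The rate-distortion cost used in HM is
$ {J_{{\rm{mode}}}} = SSE + {\lambda _{{\rm{mode}}}} \times {B_{{\rm{mode}}}} $ (1)
where $SSE$ is the sum of squared errors between the original block and its reconstruction, ${\lambda _{{\rm{mode}}}}$ is the Lagrange multiplier, and ${B_{{\rm{mode}}}}$ is the number of bits needed to code the block in that mode.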
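Equation (1) can be illustrated with a minimal Python sketch. The function names `rd_cost` and `best_partition`, the flat pixel lists, and the candidate tuples are hypothetical and chosen only for illustration; HM itself evaluates the cost inside its C++ encoder, which is not reproduced here.

```python
def rd_cost(original, reconstructed, bits, lam):
    """Eq. (1): J_mode = SSE + lambda_mode * B_mode.

    `original` and `reconstructed` are flat sequences of pixel values,
    `bits` is the bit count of the candidate and `lam` the Lagrange multiplier.
    """
    sse = sum((o - r) ** 2 for o, r in zip(original, reconstructed))
    return sse + lam * bits


def best_partition(candidates, lam):
    """Return the label of the candidate with the smallest RD cost.

    Each candidate is a hypothetical tuple (label, original, reconstructed, bits).
    """
    return min(candidates, key=lambda c: rd_cost(c[1], c[2], c[3], lam))[0]
```

With a larger multiplier the cheaper-to-signal candidate tends to win, and with a smaller one the lower-distortion candidate tends to win, which is the trade-off the exhaustive search resolves for every CU combination.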
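Video sequences exhibit spatial correlation, and references [5, 8-9] all use it to speed up the encoder. Besides this spatial correlation, adjacent frames are also temporally correlated, and the correlation is stronger for high-frame-rate video. References [13-14] use this property to reduce the number of prediction mode selections; this paper uses it to reduce the number of CU splits. To further analyze how the partitioning of the current CTU correlates with that of the co-located CTU, the all-intra configuration of the reference model HM16.7 was used, with the quantization parameter […]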
Table 1 Statistics of the depth difference between co-located CUs in adjacent frames

| Motion | Sequence | Difference -1 (%) | Difference 0 (%) | Difference 1 (%) | Within [-1, 1] (%) | Outside [-1, 1] (%) |
|---|---|---|---|---|---|---|
| Rapid change | Traffic | 14.41 | 67.33 | 14.04 | 95.78 | 4.22 |
| Rapid change | PeopleOnStreet | 15.36 | 65.27 | 15.30 | 95.94 | 4.06 |
| Rapid change | BasketballDrive | 18.86 | 57.08 | 19.29 | 95.23 | 4.77 |
| Rapid change | BasketballDrill | 14.99 | 65.99 | 14.55 | 95.53 | 4.47 |
| Rapid change | RaceHorses | 15.97 | 63.99 | 15.67 | 95.63 | 4.37 |
| Moderate change | BQTerrace | 10.94 | 75.35 | 11.00 | 97.30 | 2.70 |
| Moderate change | FourPeople | 9.92 | 77.97 | 9.69 | 97.58 | 2.42 |
| Moderate change | BasketballPass | 8.93 | 78.54 | 10.20 | 97.66 | 2.34 |
| Slow change | BlowingBubbles | 8.11 | 82.84 | 8.01 | 98.96 | 1.04 |
| Slow change | Johnny | 8.97 | 80.24 | 9.14 | 98.36 | 1.64 |
| Average | | 12.65 | 71.46 | 12.69 | 96.80 | 3.20 |
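Table 1 shows that, on average, the current CTU has the same depth as the co-located CTU of the previous frame in 71.46% of cases, and the absolute depth difference is at most 1 in about 96.8% of cases. Sequences with slow motion are more strongly correlated in time than sequences with rapid motion. Adjacent frames are therefore strongly correlated, and the partition of the current coding block can be obtained by merging or splitting relative to the partition of the co-located block in the previous frame.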
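The kind of statistic reported in Table 1 can be sketched as follows. This is an illustration only: the function name `depth_difference_stats`, the flat depth lists, and the sign convention (current depth minus previous depth) are assumptions, since the source does not state how the difference is signed or how the depths were sampled.

```python
from collections import Counter


def depth_difference_stats(prev_depths, curr_depths):
    """Percentage of co-located units per depth difference.

    The difference is taken as current minus previous (an assumed convention).
    Both inputs are equally long sequences of depths in the range 0..3.
    """
    diffs = [c - p for p, c in zip(prev_depths, curr_depths)]
    if not diffs:
        raise ValueError("no co-located units to compare")
    counts = Counter(diffs)

    def pct(k):
        return 100.0 * counts.get(k, 0) / len(diffs)

    within = pct(-1) + pct(0) + pct(1)
    return {"-1": pct(-1), "0": pct(0), "1": pct(1),
            "within": within, "outside": 100.0 - within}
```

A high "within" share, as in Table 1, is what justifies starting the search from the previous frame's partition and only merging or splitting by one level.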
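Besides the partition depth, the rate-distortion costs of adjacent frames are also strongly correlated [15]. For the five classes of test sequences at different resolutions listed in the common test conditions recommended by the Joint Collaborative Team on Video Coding (JCT-VC), the rate-distortion costs of co-located CTUs in adjacent frames were first obtained, and the ratio of the cost of the CTU in the later frame to the cost of the co-located CTU in the earlier frame was then computed. The resulting histogram of this ratio is shown in Fig. 1.
Fig. 1 shows that the rate-distortion cost ratios of co-located CTUs are concentrated overwhelmingly in the interval [0.8, 1.2] and approximately follow a normal distribution. Pixel values at corresponding positions of two frames are not fixed: where the change is small, the cost ratio of the corresponding CTUs stays close to 1, and where the change is fast, the ratio varies more. For regions where the two frames are strongly correlated, the most probable minimum rate-distortion cost of the current CTU can therefore be predicted from the cost of the co-located CTU. During encoding, if the rate-distortion cost of the current CU is below this predicted minimum, splitting of the current CU can be terminated.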
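The share of cost ratios falling inside [0.8, 1.2] can be computed with a short sketch like the one below. The function name `rd_ratio_share` and its list inputs are hypothetical; the interval bounds are the ones quoted in the text, and co-located pairs with a non-positive previous cost are simply skipped as an assumed convention.

```python
def rd_ratio_share(prev_costs, curr_costs, low=0.8, high=1.2):
    """Fraction of co-located CTU pairs whose cost ratio lies in [low, high].

    The ratio is current cost divided by previous cost. Pairs whose previous
    cost is not positive are skipped (an assumption made for this sketch).
    """
    ratios = [c / p for p, c in zip(prev_costs, curr_costs) if p > 0]
    if not ratios:
        raise ValueError("no valid co-located cost pairs")
    inside = sum(1 for r in ratios if low <= r <= high)
    return inside / len(ratios)
```

For example, previous costs of 100 for four CTUs and current costs of 90, 110, 150 and 100 give ratios 0.9, 1.1, 1.5 and 1.0, so three of the four pairs fall inside the interval.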
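Based on the analysis above, this paper uses the spatial-temporal correlation of high-frame-rate, high-resolution video to improve the CU partitioning algorithm as follows:
1) If the current frame is the first frame or a refresh frame, use the standard HM algorithm; otherwise go to step 2).
2) Determine the texture complexity of the current largest coding unit (LCU) by computing the mean depth of the co-located CTU in the adjacent frame […]
$ {D_{{\rm{co}}}} = \frac{{\sum\limits_{i = 0}^{255} {{D_i}} }}{{256}} $ (2)
$ {D_{{\rm{pre}}}} = {\alpha _1} \times {D_{{\rm{left}}}} + {\alpha _2} \times {D_{{\rm{lu}}}} + {\alpha _3} \times {D_{{\rm{up}}}} $ (3)
where ${D_i}$ is the $i$-th of the 256 depth values of the co-located CTU, ${D_{{\rm{left}}}}$, ${D_{{\rm{lu}}}}$ and ${D_{{\rm{up}}}}$ are the depths of the left, upper-left and upper neighboring CTUs, and ${\alpha _1}$, ${\alpha _2}$ and ${\alpha _3}$ are their weights […]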
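3) If the co-located CU in the previous frame has depth 1, split directly on the basis of the co-located CTU of the previous frame, and determine the threshold at which splitting of the current CU terminates
$ T = \frac{{a \times {J_{{\rm{co}}}}}}{{{4^{Depth}}}} $ (4)
where ${J_{{\rm{co}}}}$ is the rate-distortion cost of the co-located CTU, $Depth$ is the depth of the current CU, and $a$ is a coefficient […]
4) If the co-located CU in the previous frame has depth 2, compute the standard deviation of the luma pixels of the co-located CU blocks
$ S = \frac{1}{N}\sqrt {\sum\limits_{i = 0}^{N-1} {\sum\limits_{j = 0}^{N-1} {{{\left( {p\left( {i, j} \right)-\bar p} \right)}^2}} } } $ (5)
where $N$ is the side length of the CU, $p\left( {i, j} \right)$ is the luma value of the pixel at $\left( {i, j} \right)$, and $\bar p$ is the mean luma value of the CU […]
According to statistics, […]
5) If the co-located CU in the previous frame has depth 3, compute the rate-distortion cost of the current CU at depth 3 and the rate-distortion cost of merging upward. If the merged cost is smaller, merge relative to the co-located CU; otherwise keep the same partition as the co-located CU.
6) If the current CU has depth 3, the co-located CU in the previous frame was not split into four prediction units (PUs), and […]
The flow of the proposed algorithm is shown in Fig. 5.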
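The quantities in Eqs. (2) to (5) and the refresh-frame check of step 1) can be sketched in Python as below. This is a sketch under stated assumptions, not the authors' C++ implementation: all function names are hypothetical, the refresh interval, the weights $\alpha_1$ to $\alpha_3$ and the coefficient $a$ are passed in as parameters because their values are not given in the text, and the decision thresholds that compare these quantities are deliberately left out for the same reason.

```python
import math


def is_refresh_frame(frame_index, interval):
    """Step 1: frame 0 and every `interval`-th frame use the standard HM search.

    `interval` is a hypothetical parameter; the source gives no value.
    """
    return frame_index % interval == 0


def colocated_mean_depth(depths):
    """Eq. (2): mean of the 256 depth values of the co-located CTU."""
    if len(depths) != 256:
        raise ValueError("expected 256 depth values")
    return sum(depths) / 256


def neighbour_weighted_depth(d_left, d_lu, d_up, weights):
    """Eq. (3): weighted depth of the left, upper-left and upper CTUs.

    `weights` = (alpha1, alpha2, alpha3); the values are assumptions.
    """
    a1, a2, a3 = weights
    return a1 * d_left + a2 * d_lu + a3 * d_up


def termination_threshold(j_co, depth, a):
    """Eq. (4): T = a * J_co / 4**Depth; `a` is an assumed coefficient."""
    return a * j_co / (4 ** depth)


def luma_std(block):
    """Eq. (5): S = (1/N) * sqrt(sum of squared deviations), N x N block."""
    n = len(block)
    if n == 0 or any(len(row) != n for row in block):
        raise ValueError("block must be a non-empty N x N list of rows")
    mean = sum(sum(row) for row in block) / (n * n)
    squared = sum((p - mean) ** 2 for row in block for p in row)
    return math.sqrt(squared) / n
```

Dividing by $4^{Depth}$ in Eq. (4) scales a CTU-level cost down to the area of one CU at the given depth, since each additional depth level quarters the block area.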
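2 Experimental results and analysis
To verify the performance of the algorithm, five classes of standard test sequences (A to E) were encoded on HM16.7. All frames were coded as I frames, and the reference model HM used the all-intra configuration, at […]
$ \Delta T = \frac{{{T_{{\rm{proposed}}}}-{T_{{\rm{HM}}}}}}{{{T_{{\rm{HM}}}}}} \times 100\% $ (6)
$ \Delta BR = \frac{{B{R_{{\rm{proposed}}}}-B{R_{{\rm{HM}}}}}}{{B{R_{{\rm{HM}}}}}} \times 100\% $ (7)
$ \Delta PSN{R_{\rm{Y}}} = PSN{R_{{\rm{proposed}}}}-PSN{R_{{\rm{HM}}}} $ (8)
where ${T_{{\rm{proposed}}}}$ and ${T_{{\rm{HM}}}}$ are the coding times of the proposed algorithm and of HM16.7, $B{R_{{\rm{proposed}}}}$ and $B{R_{{\rm{HM}}}}$ are the corresponding bit rates, and $PSN{R_{{\rm{proposed}}}}$ and $PSN{R_{{\rm{HM}}}}$ are the corresponding luma PSNR values.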
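Equations (6) to (8) are simple relative and absolute differences, as the following sketch shows. The function names and the sample numbers in the usage note are made up for illustration and are not measurements from the paper.

```python
def delta_percent(proposed, reference):
    """Eqs. (6) and (7): relative change of the proposed value in percent."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return (proposed - reference) / reference * 100.0


def delta_psnr(psnr_proposed, psnr_reference):
    """Eq. (8): absolute luma PSNR difference in dB."""
    return psnr_proposed - psnr_reference
```

For instance, a hypothetical encoder run of 59.6 s against a 100 s reference gives `delta_percent(59.6, 100.0)` of about -40.4, so a negative $\Delta T$ means time saved, while a positive $\Delta BR$ means a higher bit rate.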
Table 2 Comparison between the proposed algorithm and the standard algorithm of HM16.7

| Sequence | Class (resolution) | ΔBR (%) | ΔPSNR_Y (dB) | BDBR (%) | BD-PSNR (dB) | ΔT (%) |
|---|---|---|---|---|---|---|
| Traffic | A (2560×1600) | 0.99 | -0.046 | 1.88 | -0.10 | -44.96 |
| PeopleOnStreet | A (2560×1600) | 1.39 | -0.070 | 2.63 | -0.15 | -45.00 |
| Kimono | B (1920×1080) | 0.24 | -0.007 | 0.42 | -0.02 | -16.27 |
| ParkScene | B (1920×1080) | 0.54 | -0.037 | 1.44 | -0.06 | -38.00 |
| Cactus | B (1920×1080) | 1.24 | -0.042 | 2.44 | -0.09 | -37.61 |
| BasketballDrive | B (1920×1080) | 0.72 | -0.019 | 1.46 | -0.04 | -27.96 |
| BQTerrace | B (1920×1080) | 1.47 | -0.053 | 2.41 | -0.15 | -41.10 |
| BasketballDrill | C (832×480) | 2.15 | -0.100 | 4.32 | -0.20 | -45.63 |
| BQMall | C (832×480) | 1.89 | -0.140 | 4.33 | -0.27 | -44.72 |
| PartyScene | C (832×480) | 1.64 | -0.222 | 4.66 | -0.36 | -46.06 |
| BasketballPass | D (416×240) | 1.49 | -0.095 | 3.34 | -0.19 | -41.48 |
| BQSquare | D (416×240) | 3.25 | -0.272 | 6.63 | -0.57 | -47.46 |
| BlowingBubbles | D (416×240) | 0.91 | -0.126 | 3.09 | -0.18 | -43.45 |
| FourPeople | E (1280×720) | 1.51 | -0.054 | 2.57 | -0.15 | -44.80 |
| Johnny | E (1280×720) | 1.24 | -0.036 | 2.23 | -0.09 | -38.56 |
| KristenAndSara | E (1280×720) | 1.90 | -0.055 | 3.07 | -0.16 | -43.19 |
| Average | | 1.4 | -0.0859 | 2.93 | -0.17 | -40.4 |
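Table 2 shows that, compared with HM16.7, the proposed algorithm saves 40.4% of the coding time on average, while its effect on video coding quality is small enough to be negligible and the bit rate increases by only 1.4%. As the video resolution increases, the absolute values of BDBR and BD-PSNR tend to decrease, so the algorithm is better suited to high-frame-rate, high-resolution sequences, which have stronger spatial-temporal correlation. The time saving for the Kimono sequence is much smaller than for the other test sequences. The reason is that Kimono is relatively flat and blocks of depth 0 make up a relatively large share of it, so step 2) of the fast algorithm can rarely skip the rate-distortion cost calculation at depth 0; the algorithm has little effect on this sequence and the reduction in coding time is limited.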
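3 Conclusion
To address the large increase in encoder complexity caused by the CU quadtree partitioning used in HEVC, this paper makes full use of the spatial-temporal correlation of video sequences and proposes a fast partitioning algorithm for HEVC intra coding units. The algorithm skips the rate-distortion cost calculation for CUs of depth 0 and uses the similarity of the partitions in adjacent frames to decide the partition of the current coding unit in advance. Experimental results show that, compared with the reference model HM16.7, the algorithm has a negligible effect on video coding quality while saving 40.4% of the coding time on average; the bit rate increases by only 1.4%, the BDBR increases by 2.93% on average, and the BD-PSNR falls by only 0.17 dB, which effectively improves the real-time performance of HEVC encoding. The algorithm works better for high-resolution, high-frame-rate video and is less satisfactory for low-resolution video, and its effect under the low-delay and random-access coding structures is weaker than under the all-intra structure. Future work will therefore optimize the algorithm for low-resolution video and verify its effect and reliability under the different HEVC modes.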
References
[1] Sullivan G J, Ohm J, Han W J, et al. Overview of the high efficiency video coding (HEVC) standard[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2012, 22(12): 1649–1668. [DOI:10.1109/TCSVT.2012.2221191]
[2] Bossen F, Bross B, Suhring K, et al. HEVC complexity and implementation analysis[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2012, 22(12): 1685–1696. [DOI:10.1109/TCSVT.2012.2221255]
[3] Lainema J, Bossen F, Han W J, et al. Intra coding of the HEVC standard[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2012, 22(12): 1792–1801. [DOI:10.1109/TCSVT.2012.2221525]
[4] Zhu W M. Research on HEVC fast intra coding algorithms based on quadtree structure[D]. Nanjing: Nanjing University of Posts and Telecommunications, 2015. [朱惟妙. 基于四叉树结构的HEVC快速帧内算法研究[D]. 南京: 南京邮电大学, 2015.] http://cdmd.cnki.com.cn/Article/CDMD-10293-1015731124.htm
[5] Cen Y F, Wang W L, Yao X W. A fast CU depth decision mechanism for HEVC[J]. Information Processing Letters, 2015, 115(9): 719–724. [DOI:10.1016/j.ipl.2015.04.001]
[6] Cho S, Kim M. Fast CU splitting and pruning for suboptimal CU partitioning in HEVC intra coding[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2013, 23(9): 1555–1564. [DOI:10.1109/TCSVT.2013.2249017]
[7] Kim J, Choe Y, Kim Y G. Fast coding unit size decision algorithm for intra coding in HEVC[C]//Proceedings of IEEE International Conference on Consumer Electronics. Las Vegas: IEEE, 2013: 637–638. [DOI:10.1109/ICCE.2013.6487050]
[8] Shen L Q, Zhang Z Y, An P. Fast CU size decision and mode decision algorithm for HEVC intra coding[J]. IEEE Transactions on Consumer Electronics, 2013, 59(1): 207–213. [DOI:10.1109/TCE.2013.6490261]
[9] Qi M B, Chen X L, Yang Y F, et al. Fast coding unit splitting algorithm for high efficiency video coding intra prediction[J]. Journal of Electronics & Information Technology, 2014, 36(7): 1699–1705. [齐美彬, 陈秀丽, 杨艳芳, 等. 高效率视频编码帧内预测编码单元划分快速算法[J]. 电子与信息学报, 2014, 36(7): 1699–1705.] [DOI:10.3724/SP.J.1146.2013.01148]
[10] Shen L Q, Zhang Z Y, Liu Z. Effective CU size decision for HEVC intracoding[J]. IEEE Transactions on Image Processing, 2014, 23(10): 4232–4241. [DOI:10.1109/TIP.2014.2341927]
[11] Nishikori T, Nakamura T, Yoshitome T, et al. A fast CU decision using image variance in HEVC intra coding[C]//Proceedings of IEEE Symposium on Industrial Electronics and Applications. Kuching, Malaysia: IEEE, 2013: 52–56. [DOI:10.1109/ISIEA.2013.6738966]
[12] JCT-VC HEVC[CP/OL]. 2015-10-13[2017-5-15]. https://hevc.hhi.fraunhofer.de/trac/hevc/browser#tags/HM16.7.
[13] Zhong W B, Meng Y R. Fast multiple reference frame selection algorithm for H.264 based on temporal and spatial correlation[J]. Computer Science, 2013, 40(5): 93–95. [仲伟波, 孟艳茹. 基于时空相关的H.264多参考帧快速选择算法[J]. 计算机科学, 2013, 40(5): 93–95.] [DOI:10.3969/j.issn.1002-137X.2013.05.024]
[14] Li Y, He X H, Zhong G Y, et al. A fast inter-frame prediction unit mode decision algorithm for high efficiency video coding based on temporal correlation[J]. Journal of Electronics & Information Technology, 2013, 35(10): 2365–2370. [李元, 何小海, 钟国韵, 等. 一种基于时域相关性的高性能视频编码快速帧间预测单元模式判决算法[J]. 电子与信息学报, 2013, 35(10): 2365–2370.] [DOI:10.3724/SP.J.1146.2013.00028]
[15] Park S J. CU encoding depth prediction, early CU splitting termination and fast mode decision for fast HEVC intra-coding[J]. Signal Processing: Image Communication, 2016, 42: 79–89. [DOI:10.1016/j.image.2015.12.006]