Video Enhancement Network Based on CNN and Transformer

Release Date: 2024-12-17 Author: YUAN Lang, HUI Chen, WU Yanfeng, LIAO Ronghua, JIANG Feng, GAO Ying

Abstract: To improve video quality after the encoding and decoding stages of video compression, this paper proposes a video quality enhancement framework based on local and non-local priors. A single convolution layer first extracts low-level features, which several Conv-Tran blocks (CTBs) then process into high-level features; these are finally transformed into a residual image. The reconstructed video frame is obtained by element-wise addition of the residual image and the original lossy video frame. Experiments show that the proposed Conv-Tran Network (CTN) effectively recovers the quality loss caused by Versatile Video Coding (VVC) and further improves VVC performance.
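
The data flow described in the abstract (head convolution, a stack of blocks, a residual image, element-wise addition with the lossy frame) can be illustrated with a minimal sketch. This is not the authors' implementation: the single-channel frame, the 3x3 kernels, the use of a convolution to produce the residual image, and the names `conv3x3` and `enhance` are all assumptions made for illustration. The internals of a CTB are not given in the abstract, so each block is represented here by an arbitrary placeholder callable.

```python
from typing import Callable, List

Frame = List[List[float]]   # single-channel frame as rows of pixel values (assumption)
Kernel = List[List[float]]  # 3x3 kernel


def conv3x3(x: Frame, k: Kernel) -> Frame:
    """3x3 cross-correlation with zero padding; output has the same size as x."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += k[di + 1][dj + 1] * x[ii][jj]
            out[i][j] = acc
    return out


def enhance(
    lossy: Frame,
    head_kernel: Kernel,
    blocks: List[Callable[[Frame], Frame]],
    tail_kernel: Kernel,
) -> Frame:
    """Global-residual pipeline: features -> blocks -> residual -> lossy + residual."""
    # Low-level features from a single convolution layer.
    feat = conv3x3(lossy, head_kernel)
    # Hypothetical stand-ins for the CTBs; their real structure is not specified here.
    for block in blocks:
        feat = block(feat)
    # Assumption: a convolution maps the high-level features to the residual image.
    residual = conv3x3(feat, tail_kernel)
    # Element-wise addition of the residual image and the lossy frame.
    return [
        [p + r for p, r in zip(lossy_row, res_row)]
        for lossy_row, res_row in zip(lossy, residual)
    ]
```

Because the network only predicts a residual, a zero residual leaves the lossy frame unchanged; the blocks need to model only the compression error rather than the full frame.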

Keywords: attention fusion mechanism; H.266/VVC; transformer; video coding; video quality enhancement
