Perceptually-Aware Live VBR Encoding Scheme for Adaptive AVC Streaming


Event Time

Originally Aired - Saturday, April 15   |   1:50 PM - 2:10 PM PT

Event Location

Pass Required: NAB Show Conference Pass


Live streaming applications currently rely on a fixed set of bitrate-resolution pairs, termed a "bitrate ladder". They also avoid two-pass variable bitrate (VBR) encoding schemes, because the first pass adds latency. Both choices have a cost. Bitrate ladder optimization is necessary to (i) decrease storage or delivery costs and/or (ii) increase Quality of Experience (QoE), and two-pass VBR encoding improves compression efficiency because the second pass makes better encoding decisions using the first-pass analysis.

In this light, this paper introduces a perceptually-aware constrained Variable Bitrate (cVBR) encoding scheme (Live VBR) for HTTP adaptive streaming applications. It performs a joint optimization that addresses the perceptual redundancy between the representations of the bitrate ladder, maximizes the perceptual quality (in terms of VMAF), and optimizes the constant rate factor (CRF). Low-complexity Discrete Cosine Transform (DCT)-energy-based spatial and temporal features (namely brightness, spatial texture information, and temporal activity) are extracted for every video segment to predict a perceptually-aware bitrate ladder for encoding. Experimental results show that, on average, Live VBR yields bitrate savings of 18.80% and 32.59% to maintain the same PSNR and VMAF, respectively, compared to the reference HTTP Live Streaming (HLS) bitrate ladder with Constant Bitrate (CBR) encoding using the x264 AVC encoder, without any noticeable additional latency in streaming. These savings are accompanied by a 68.96% cumulative decrease in storage space for the various representations and a 28.25% cumulative decrease in energy consumption, considering a perceptual difference of 6 VMAF points.
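
To make the feature-extraction step more concrete, the following is a minimal sketch of how per-segment brightness, spatial texture, and temporal activity could be derived from block-wise DCT coefficients. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the 8x8 block size, the use of the DC coefficient as a brightness proxy, the unweighted sum of absolute AC coefficients as texture energy, the frame-to-frame difference of that energy as temporal activity, and the hypothetical function names `dct_matrix`, `block_dct`, and `segment_features`.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c


def block_dct(frame, block=8):
    """2-D DCT of every non-overlapping block x block tile of a luma frame."""
    h, w = frame.shape
    h, w = h - h % block, w - w % block   # crop to a multiple of the block size
    tiles = (frame[:h, :w].astype(np.float64)
             .reshape(h // block, block, w // block, block)
             .swapaxes(1, 2))             # shape: (rows, cols, block, block)
    c = dct_matrix(block)
    return c @ tiles @ c.T                # matmul broadcasts over the tile grid


def segment_features(frames, block=8):
    """Average brightness, texture and temporal activity over a segment.

    Assumed definitions (not taken from the paper):
      brightness = mean luma recovered from the DC coefficient
      texture    = mean per-block sum of absolute AC coefficients
      temporal   = mean absolute change of per-block texture between frames
    """
    brightness, texture, temporal = [], [], []
    prev_energy = None
    for frame in frames:
        coeffs = np.abs(block_dct(frame, block))
        # For an orthonormal DCT, DC = block * (mean of the tile).
        brightness.append(float(coeffs[..., 0, 0].mean() / block))
        coeffs[..., 0, 0] = 0.0           # keep AC coefficients only
        energy = coeffs.sum(axis=(-2, -1))
        texture.append(float(energy.mean()))
        if prev_energy is not None:
            temporal.append(float(np.abs(energy - prev_energy).mean()))
        prev_energy = energy
    return {
        "brightness": float(np.mean(brightness)),
        "texture": float(np.mean(texture)),
        "temporal": float(np.mean(temporal)) if temporal else 0.0,
    }
```

In a scheme like the one described, features of this kind would feed a predictor that selects the bitrate ladder and CRF for each segment; that prediction step is not sketched here because the abstract does not describe it.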


Presented as part of:

OTT / Connected TV

Speakers

Reinhard Grandl
VP Product
Bitmovin