RDO motion estimation metric

The performance of motion estimation and motion-vector coding is critical to the performance of a video coding scheme. With motion vectors at 1/4 or 1/8 pixel accuracy, a simple-minded strategy of finding the best match between frames can greatly inflate the resulting bitrate for little or no gain in quality, because the additional accuracy is very sensitive to noise. What is required is the ability to trade off the vector bitrate against prediction accuracy, and hence against the bitrate required to code the residual frame and the eventual quality of that frame, whilst at the same time making the estimator more robust.

The simplest way to do this is to incorporate a smoothing factor into the metric used for matching blocks. So the metric consists of a basic block matching metric, plus some constant times a measure of the local motion vector smoothness. The basic block matching metric used by Dirac is Sum of Absolute Differences (SAD). Given two blocks X,Y of samples, this is given by:

$$\mathrm{SAD}(X, Y) = \sum_{i,j} \left| X_{i,j} - Y_{i,j} \right|$$
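
As an illustration of the formula above, the following is a minimal sketch of SAD over two equally sized blocks held as nested lists of integer samples. The function name `sad` and the list-of-rows block layout are assumptions made for this example; they are not taken from the Dirac source.

```python
def sad(block_x, block_y):
    """Sum of Absolute Differences between two equally sized 2-D blocks.

    Blocks are assumed (for this sketch) to be lists of rows of integers.
    """
    if len(block_x) != len(block_y):
        raise ValueError("blocks must have the same number of rows")
    total = 0
    for row_x, row_y in zip(block_x, block_y):
        if len(row_x) != len(row_y):
            raise ValueError("rows must have the same length")
        # Accumulate |X[i][j] - Y[i][j]| over every sample in the row.
        total += sum(abs(a - b) for a, b in zip(row_x, row_y))
    return total


# Hypothetical 2x2 blocks: the differences are 0, 2, 2 and 0.
current = [[1, 2], [3, 4]]
reference = [[1, 0], [5, 4]]
print(sad(current, reference))  # prints 4
```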

The smoothness measure used is the difference between the candidate motion vector and the median of the neighbouring previously computed motion vectors. Since the blocks are estimated in raster-scan order, the vectors for blocks to the left and above are available for calculating the median:

Figure: neighbouring vectors available in raster-scan order for local variance calculation

The vectors chosen for computing the local median predictor are V2, V3 and V4; this has the merit of being the same predictor as is used in coding the motion vectors.
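
A possible sketch of the median predictor follows. It assumes that the median is taken separately on the x and y components of the three neighbouring vectors, and that all three neighbours are available; how blocks at the frame edges are handled is not described here and is left out. The name `median_predictor` and the `(x, y)` tuple representation are illustrative only.

```python
def median_predictor(v2, v3, v4):
    """Component-wise median of three neighbouring motion vectors.

    Each vector is an (x, y) tuple. Taking the median per component is
    an assumption of this sketch.
    """
    neighbours = (v2, v3, v4)
    xs = sorted(v[0] for v in neighbours)
    ys = sorted(v[1] for v in neighbours)
    # With three values, the median is the middle element after sorting.
    return (xs[1], ys[1])


# Hypothetical neighbouring vectors.
print(median_predictor((1, 5), (3, -2), (2, 0)))  # prints (2, 0)
```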

The total metric is a combination of these two metrics. Given a vector V which maps the current frame block X to a block Y=V(X) in the reference frame, the metric is given by:

$$\mathrm{SAD}(X, Y) + \lambda \left( \left| V_x - \mathrm{pred}_x \right| + \left| V_y - \mathrm{pred}_y \right| \right)$$

The value λ is a coding parameter used to control the trade-off between the smoothness of the motion vector field and the accuracy of the match. When λ is very large, the smoothness (local variance) term dominates the calculation, and the motion vector which gives the smallest metric is simply that which is closest to its neighbours. When λ is very small, the metric is dominated by the SAD term, and so the best vector will simply be that which gives the best match for that block. For values in between, varying degrees of smoothness can be achieved. The parameter λ is calculated as a multiple of the RDO parameters for the L1 and L2 frames, so that if the inter frames are compressed more heavily then smoother motion vector fields will also result.
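
The effect of λ can be illustrated with a small sketch that evaluates the combined metric for a handful of candidate vectors and keeps the cheapest. The function names, the candidate list format (a vector paired with a precomputed SAD value) and the numbers are all hypothetical, chosen only to show the two extremes described above.

```python
def total_metric(sad_value, vector, pred, lam):
    """SAD plus lambda times the L1 distance from the predictor."""
    smoothness = abs(vector[0] - pred[0]) + abs(vector[1] - pred[1])
    return sad_value + lam * smoothness


def best_vector(candidates, pred, lam):
    """Return the candidate vector with the smallest total metric.

    `candidates` is a list of ((x, y), sad_value) pairs; the SAD values
    are assumed to have been computed beforehand.
    """
    vector, _ = min(
        candidates,
        key=lambda c: total_metric(c[1], c[0], pred, lam),
    )
    return vector


# Hypothetical data: (4, -3) matches best, (1, 0) equals the predictor.
pred = (1, 0)
candidates = [((4, -3), 100), ((1, 0), 120)]

print(best_vector(candidates, pred, lam=0))   # prints (4, -3)
print(best_vector(candidates, pred, lam=10))  # prints (1, 0)
```

With λ = 0 the choice depends on SAD alone, so the better-matching vector wins. With λ = 10 the six units of deviation from the predictor cost 60, which outweighs the SAD advantage of 20, so the vector equal to the predictor is chosen.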
