Prediction of motion vector data

All motion vector data is predicted from previously encoded data in the nearest neighbouring blocks or MBs. A number of conventions are observed in making these predictions.

The first convention is that all the so-called block data (the prediction modes, the motion vectors themselves, and any DC values) is associated with the top-left block of the prediction unit to which it refers. This allows a consistent prediction and coding structure to be adopted.

Example. If splitting level=1 and common mode is false then the prediction units in a MB are sub-MBs. Nevertheless, the prediction mode and any motion vectors are associated with the top-left block of each sub-MB and values need not be coded for other blocks in the sub-MB.

Figure: data other than splitting level and common mode is always associated with particular blocks, even if the relevant prediction unit is the sub-MB or MB itself.

Example. If MB_split=2 but MB_common=1 then the prediction mode (INTRA, REF1_ONLY etc) need only be coded for the top-left block in the MB. Motion vectors still need to be coded for every block in the MB if the mode is not INTRA.

The second convention is that all MB data is scanned in raster order for encoding purposes. All block data is scanned first by MB in raster order, and then in raster order within each MB. That is, taking each MB in raster order, each block value which needs to be coded within that MB is coded in raster order:

Figure: block data is scanned in raster order by MB and then in raster order within each MB
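The scan order described above can be sketched as follows. This is only an illustration: it assumes an MB spans 4x4 blocks, so that splitting levels 0, 1 and 2 give prediction units of 4x4, 2x2 and 1x1 blocks, and the names coded_block_positions and split_level are hypothetical. It lists the top-left block of every prediction unit, which is where a value would be coded if the prediction mode requires one.

```python
MB_SIZE = 4  # assumption: a macroblock (MB) spans 4x4 blocks


def coded_block_positions(mbs_wide, mbs_high, split_level):
    """Yield (x, y) block coordinates in coding order.

    split_level(mb_x, mb_y) is a hypothetical callback returning 0, 1 or 2.
    Only the top-left block of each prediction unit is yielded.
    """
    for mb_y in range(mbs_high):          # MBs in raster order
        for mb_x in range(mbs_wide):
            # Prediction unit size in blocks: 4, 2 or 1
            step = MB_SIZE >> split_level(mb_x, mb_y)
            for y in range(0, MB_SIZE, step):      # raster order within the MB
                for x in range(0, MB_SIZE, step):
                    yield (mb_x * MB_SIZE + x, mb_y * MB_SIZE + y)


if __name__ == "__main__":
    # Two MBs side by side: the first unsplit, the second split into sub-MBs.
    levels = [0, 1]
    for pos in coded_block_positions(2, 1, lambda mx, my: levels[mx]):
        print(pos)
```

The sketch ignores the common mode flag and the prediction mode, both of which can further reduce the set of blocks for which a given value is actually coded.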

The third convention concerns the availability of values for prediction when they are not coded for every block. Since prediction is based on neighbouring values, a coded value must be propagated to all the blocks of its prediction unit whenever the splitting level, common mode and prediction mode mean that values are not required for every block.

Example. The next diagram shows the effect of this. Suppose we are coding REF1_x. In the first MB, splitting level=0 and so at most the top-left block needs a value, which can be predicted from values in previously coded MBs. As it happens, the prediction mode is REF1_ONLY and so a value v is coded. The value v is then deemed to apply to every block in the MB. In the next MB, splitting level=1 and common mode=false, so the unit of prediction is the sub-MB. In the top-left sub-MB the prediction mode is, say, REF1AND2 and so a value x is coded for the top-left block of that sub-MB. It can be predicted from any available values in neighbouring blocks, and in particular the value v is available from the adjacent block.

Figure: For the purposes of prediction, values are deemed to be propagated within MBs or sub-MBs.
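Propagation can be pictured as filling a per-block grid. The sketch below is an assumption about how one might store the data, not a description of any particular implementation: grid is a list of rows indexed as grid[y][x], None marks a block with no value, and propagate is a hypothetical helper.

```python
def propagate(grid, x0, y0, unit_size, value):
    """Copy value into every block of the square prediction unit
    whose top-left block is (x0, y0) and whose side is unit_size blocks."""
    for y in range(y0, y0 + unit_size):
        for x in range(x0, x0 + unit_size):
            grid[y][x] = value


if __name__ == "__main__":
    # Two MBs side by side, each assumed to be 4x4 blocks.
    grid = [[None] * 8 for _ in range(4)]
    v = 5
    propagate(grid, 0, 0, 4, v)   # first MB: splitting level 0, value v
    # The block to the left of (4, 0) now holds v and can be used
    # to predict the value x coded for the second MB's first sub-MB.
    print(grid[0][3])
```

After the call, every block of the first MB holds v, so the left neighbour of the second MB's top-left block supplies a value for prediction even though v was coded only once.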

Prediction methods

The prediction used depends on the MV data being coded, but in all cases the predictor uses the aperture shown in the figure below. The aperture consists of blocks where block data is concerned and of MBs where MB data is concerned.

Figure: Aperture for MV prediction.

The splitting level is predicted as the mean of the levels of the three MBs in the aperture. Likewise, the common mode value is predicted by the mean of the three values in the aperture, by interpreting a boolean value as a 0 or 1.

Of the block data, the prediction mode is also predicted as a mean, the various modes being given values from 0 (INTRA) to 3 (REF1AND2). The motion vector data is predicted by taking the median of each component separately. Using the median helps ensure that the prediction is not strongly biased by a single large motion vector.

The DC values are predicted by the average of the three values in the aperture.

In many cases values are not available from all blocks in the aperture, for example when a neighbouring block has a prediction mode that does not use the value being coded. Such blocks are simply excluded from consideration. Where only two values are available, the median motion vector predictor reverts to a mean. Where only one value is available, that value is the prediction. Where no value is available, no prediction is made, except for the DC values, where 128 is used by default.
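The fallback rules above can be sketched as two small functions. The text does not say how a mean is rounded to an integer, so the floor division used here is an assumption, as are the function names. Each function takes only the values that are actually available in the aperture (between zero and three of them).

```python
def predict_mv_component(available):
    """Predict one motion vector component from the available aperture values.

    Returns None when no value is available (no prediction is made).
    """
    if not available:
        return None
    if len(available) == 1:
        return available[0]
    if len(available) == 2:
        # Median reverts to a mean; floor rounding is an assumption.
        return sum(available) // 2
    return sorted(available)[1]  # median of three


def predict_dc(available, default=128):
    """Predict a DC value as the mean of the available aperture values."""
    if not available:
        return default
    return sum(available) // len(available)  # floor rounding is an assumption


if __name__ == "__main__":
    print(predict_mv_component([3, -40, 5]))
    print(predict_mv_component([2, 6]))
    print(predict_dc([]))
```

With the values 3, -40 and 5 the median is 3, so the outlier -40 has no influence on the prediction, whereas a mean would have been pulled strongly towards it.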

In the case of the MB data, the number of possible values is only 3 for MB_split and 2 for MB_common. The prediction can therefore use modulo arithmetic, producing an unsigned prediction residue of 0, 1 or 2 in the first case and 0 or 1 in the second. All other predictions produce signed prediction residues.
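A minimal sketch of the modulo residue for MB data follows. It assumes the mean is rounded to the nearest integer and that an empty aperture is treated as a prediction of 0, which leaves the value unchanged; neither detail is stated in the text, and the function names are hypothetical.

```python
def mean_prediction(neighbours):
    """Rounded integer mean of the available aperture values.

    Returns 0 when nothing is available (assumption: equivalent to
    making no prediction). Round-to-nearest is also an assumption.
    """
    if not neighbours:
        return 0
    n = len(neighbours)
    return (sum(neighbours) + n // 2) // n


def encode_residue(actual, prediction, num_values):
    """Unsigned residue in the range 0 .. num_values - 1."""
    return (actual - prediction) % num_values


def decode_value(residue, prediction, num_values):
    """Invert encode_residue."""
    return (residue + prediction) % num_values


if __name__ == "__main__":
    pred = mean_prediction([2, 2, 1])      # splitting levels of three MBs
    res = encode_residue(0, pred, 3)       # MB_split has 3 possible values
    print(pred, res, decode_value(res, pred, 3))
```

Because the actual value and the prediction both lie in the same small range, wrapping the difference modulo the number of values loses no information and avoids the need for a sign.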
