[cairo] Planar YUV support
sandmann at daimi.au.dk
Wed May 12 13:04:33 PDT 2010
> Here's an update on my work on YUV planar support, also known as "what
> we got from the gstreamer-cairo hackfest". I've reworked my experimental
> code from back then to incorporate the things we agreed on. It is not
> complete yet, but is only missing details. So if you're a maintainer of
> the libraries in question, now is a good time to review and raise any
> complaints or issues with the general design of the code.
First of all, thanks for working on YCrCb support. There are a lot of
interesting things that become possible with this. If all goes as
planned, this will be the major new feature in 0.20.0.
> Details missing in the implementation that I intend to fix:
> - Getting the fetchers right for subsampled formats and different
This has to do with an overall concern I have, which is about how
YCrCb fits in with pixman's image processing pipeline, and how to do
filtering and subsample locations.
The existing 8 bits R'G'B' pipeline looks like this:
* Convert each image sample to a8r8g8b8
* Extend grid in all directions according to repeat mode
* Interpolate between samples according to filter
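As a concrete sketch of that last step, bilinear filtering between four
a8r8g8b8 samples might look like this. This is illustrative code, not
pixman's actual implementation; the function name and the use of 8.8
fixed-point weights (0..256, where 256 means 1.0) are my own:

```c
#include <stdint.h>

/* Linearly interpolate one 8-bit channel; t is in [0, 256]. */
static uint32_t
lerp_channel (uint32_t a, uint32_t b, uint32_t t)
{
    return (a * (256 - t) + b * t) >> 8;
}

/* Bilinear interpolation of four a8r8g8b8 samples (top-left,
 * top-right, bottom-left, bottom-right) at the fractional position
 * (tx, ty), both in [0, 256].  Each of the four channels is
 * interpolated independently. */
uint32_t
bilinear_argb (uint32_t tl, uint32_t tr,
               uint32_t bl, uint32_t br,
               uint32_t tx, uint32_t ty)
{
    uint32_t result = 0;
    int shift;

    for (shift = 0; shift < 32; shift += 8)
    {
        uint32_t top = lerp_channel ((tl >> shift) & 0xff,
                                     (tr >> shift) & 0xff, tx);
        uint32_t bot = lerp_channel ((bl >> shift) & 0xff,
                                     (br >> shift) & 0xff, tx);

        result |= lerp_channel (top, bot, ty) << shift;
    }

    return result;
}
```

Note that filtering each channel independently like this is only
correct for premultiplied alpha, a point that comes up again below.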
Your patches add YCbCr support at the first stage, in the sense that
they convert each Y sample to a8r8g8b8 using the nearest-neighbor Cb
and Cr samples.
The problem with this scheme is that if we add bilinear interpolation
and support for MPEG-2 sampling structure, then those features would
also have to be added at the fetching stage. And then the later
interpolation stage would either do the wrong thing by interpolating
_again_, or it would have to be disabled and the fetch stage would
have to deal with fetching with transformed coordinates. Basically,
this gets really messy quickly.
Instead, we need to make some changes to the pipeline. A YCbCr image
is an RGB image that has had two things done to it:
1. It was converted to YCbCr
2. It had its Cb and Cr components subsampled
To get such an image back to RGB where we can composite with it, we
have to reverse each of those transformations: First we have to
reverse the subsampling, then we have to reverse the color coding.
Reversing subsampling is a form of interpolation, so it seems natural
to do it as part of the existing interpolation step. Since color
conversion has to happen after interpolation, this then implies that
the first stage can no longer be "convert to argb", but instead must
simply be "widen to 8 bits, while keeping the existing color coding".
The location of the chroma sample points varies from format to
format. This means that the interpolation code will have to apply
special adjustments when it computes which samples to use for any
given intermediate point.
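For example, with 4:2:0 data, MPEG-1/JPEG places each chroma sample at
the center of its 2x2 block of luma samples, while MPEG-2 co-sites it
horizontally with the left luma column. The horizontal adjustment could
be sketched like this, in 16.16 fixed point (the enum, function name,
and offsets here are illustrative, not pixman API; only non-negative
coordinates are considered):

```c
#include <stdint.h>

typedef enum { SITING_JPEG, SITING_MPEG2 } chroma_siting_t;

/* Map a fractional luma-space x coordinate (16.16 fixed point) to
 * the Cb/Cr plane's own coordinate space for 2x horizontally
 * subsampled data.  JPEG siting puts chroma sample 0 at luma
 * position 0.5, so we shift by half a luma pixel before halving;
 * MPEG-2 siting puts it at luma position 0, so we just halve. */
int32_t
luma_to_chroma_x (int32_t x, chroma_siting_t siting)
{
    if (siting == SITING_JPEG)
        x -= 0x8000;    /* half a luma pixel in 16.16 fixed point */

    return x >> 1;      /* chroma plane is half the luma width */
}
```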
The new pipeline then looks like this:
* Widen to 8 bit components
* Interpolate between samples according to filter
* Convert to RGB coding
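For a Y'CbCr image, the 'Convert to RGB coding' stage applies the
inverse of the color matrix. As a sketch, here is the widely used
integer approximation of the BT.601 "video range" conversion (which
matrix actually applies would depend on the PIXMAN_COLOR_SPACE_YCBCR_*
value; the function itself is illustrative):

```c
#include <stdint.h>

static uint8_t
clamp (int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : v;
}

/* Convert one BT.601 video-range Y'CbCr sample (Y in [16, 235],
 * Cb/Cr in [16, 240]) to packed a8r8g8b8 with full alpha, using
 * the usual x256 fixed-point coefficients:
 *   R = 1.164 (Y - 16)                    + 1.596 (Cr - 128)
 *   G = 1.164 (Y - 16) - 0.391 (Cb - 128) - 0.813 (Cr - 128)
 *   B = 1.164 (Y - 16) + 2.018 (Cb - 128)
 */
uint32_t
ycbcr_to_argb (uint8_t y, uint8_t cb, uint8_t cr)
{
    int yy = (y - 16) * 298;
    int r  = (yy                  + 409 * (cr - 128) + 128) >> 8;
    int g  = (yy - 100 * (cb - 128) - 208 * (cr - 128) + 128) >> 8;
    int b  = (yy + 516 * (cb - 128)                  + 128) >> 8;

    return 0xff000000 | (clamp (r) << 16) | (clamp (g) << 8) | clamp (b);
}
```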
The PIXMAN_COLOR_SPACE_YCBCR_* entries in the pixman_color_space_t
enum in the branch would be used in the 'Convert to RGB coding' stage.
But the PIXMAN_COLOR_SPACE_ARGB_UNMULTIPLIED doesn't fit in here
because premultiplication is something that has to happen _before_
interpolation, whereas color decoding needs to happen after. This
suggests to me that those two things don't belong in the same enum. I
do think support for unpremultiplied formats is worthwhile, but it
seems orthogonal to YCrCb support.
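To see why premultiplication must precede interpolation, consider
interpolating halfway between an opaque red pixel and a fully
transparent pixel whose (meaningless) color happens to be green. A
minimal sketch of the two orderings for the red channel, with
hypothetical helper names:

```c
#include <stdint.h>

/* Halfway interpolation of one channel (0..255). */
static int mid (int a, int b) { return (a + b) / 2; }

/* Premultiply a channel by alpha (both 0..255). */
static int premul (int c, int a) { return c * a / 255; }

/* Correct order: premultiply each pixel first, then interpolate. */
int
lerp_red_premul_first (int r0, int a0, int r1, int a1)
{
    return mid (premul (r0, a0), premul (r1, a1));
}

/* Wrong order: interpolate the unpremultiplied values, then
 * premultiply by the interpolated alpha. */
int
lerp_red_premul_last (int r0, int a0, int r1, int a1)
{
    return premul (mid (r0, r1), mid (a0, a1));
}
```

For opaque red (r=255, a=255) against transparent (r=0, a=0), the
first order gives 127 (half-strength red, as expected) while the
second gives 63, because the transparent pixel's irrelevant color
values leak into the result.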
In practical terms, the above means YCrCb processing will have to go
through the bits_image_fetch_transformed() path and that the
fetch_scanline() and fetch_pixel() function pointers should be
considered *sample* fetchers that simply widen and complete the
samples with respect to their existing color coding, but don't do
interpolation or color conversion. The x and y coordinates passed to
them must then always be integers and refer to samples. If you pass
(1, 1) to one of those fetchers, the result will be the second sample
on the second row of the component in question.
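Concretely, a per-component sample fetcher for, say, the Cb plane
could be as simple as this. The planar_image_t struct and the function
name are hypothetical (not pixman's actual pixman_image_t), just to
illustrate the contract: integer sample coordinates in the plane's own
grid go in, the widened sample comes out, and nothing else happens:

```c
#include <stdint.h>

/* A hypothetical planar image layout, for illustration only. */
typedef struct
{
    uint8_t *y_plane,  *cb_plane,  *cr_plane;
    int      y_stride,  cb_stride,  cr_stride;
} planar_image_t;

/* Sample fetcher in the sense described above: (x, y) are integer
 * sample coordinates in the Cb plane's own grid, and the fetcher
 * only widens the 8-bit sample to an int.  No interpolation, no
 * color conversion -- those happen later in the pipeline. */
int
fetch_sample_cb (const planar_image_t *image, int x, int y)
{
    return image->cb_plane[y * image->cb_stride + x];
}
```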
It is then the job of the code in fetch_transformed() to ask the
fetchers for the correct samples for a given pair of (fractional)
coordinates, and then interpolate between those samples.
Since the color coding will have to be preserved all the way up to the
fetch_transformed() stage, we will need separate fetchers for the Y,
Cb and Cr components. The transforming code path will have to figure
out which fetchers to use.
In principle we could have separate A, R, G, B fetchers as well, but
we can leave that alone for now since we don't have subsampled argb
formats.
So the way I'd write this code is to add a new component enum in
pixman-bits-image.c and then extend the various fetch_pixel_<filter>()
functions to take that enum as an argument.
The bits_image_fetch_pixel_filtered() function is then extended to
check the color coding for the image in question, and if it is YCbCr,
to fetch the three components separately and then do the color
conversion:
    x_Y, y_Y = x, y;
    x_Cb, y_Cb = (adjust for subsampling structure);
    x_Cr, y_Cr = (adjust for subsampling structure);

    if (is ycbcr)
    {
        y = fetch_nearest (image, COMPONENT_Y, x_Y, y_Y);
        cb = fetch_nearest (image, COMPONENT_CB, x_Cb, y_Cb);
        cr = fetch_nearest (image, COMPONENT_CR, x_Cr, y_Cr);

        return color_convert (image, y, cb, cr);
    }

    return fetch_nearest (image, COMPONENT_ARGB, x, y);
where get_pixel() then finally picks the right fetcher to use
based on the enum value.
Maybe your idea of eliminating get_pixel() altogether and just using
fetch_scanline() with a length of one could make this simpler.