[cairo] [RFC] Color space API (partial proposal)

Sun Feb 28 01:36:34 PST 2010

>> Also, I'm ignoring the video-oriented formats as mere convolutions and
>> compressions of RGB.  AFAIK, you can't treat a YUV420 (for example)
>> image as a destination directly, only as a source.
>
> This is a good point. These are not "color spaces" as far as color
> management is concerned. You cannot mix in YUV numbers directly (because
> black is not 0,0,0) so it cannot be a usable blending space.

Right.  To a certain extent you need to consider them for colour
management, because there are several well-defined but incompatible
ways of getting from YUV to sRGB and back.  However, they cannot be
used as a blending space - and they cannot be used as a destination
format (at least not without losing information).  I consider those
two latter facts to be instructive.

I'm not quite sure whether L*a*b* would count as an additive colour
space suitable for blending in.  The fact that the ab components are
signed would presumably complicate that, so let's skip that question
for now.  But XYZ is definitely just as suitable as RGB or CMYK for
blending in.

>> 2) Introduce an extended operation, which replaces the traditional D =
>> (S * M) OP D with D = (colour(S) * M) OP D.  The colour(S) operation
>> is a colour-management transform specified by the user, and the
>> operation is legal as long as the result of colour(S) would be legal
>> for the source operand in a normal operation.
>
> As far as I can tell this is exactly what is proposed. The function
> colour(S) however is not directly specified by the user, but is chosen by
> Cairo to translate the color space of S to the color space of D. If the
> transformation is the identity this then is the same as normal behavior
> without the function. Cairo must detect that this is the identity and make
> that code path fast.
>
> The change in api is that all colors sent to Cairo have a color space
> attached to them, and all surfaces have a "blending" space. There may also
> be a conversion from the blending space to the device color space when the
> drawing is finally put on the screen.

I think most of the controversy revolves around the fact that there is
no single method of converting between any two colour spaces that will
satisfy two arbitrarily chosen colour geeks.  Let me explain that a
bit further...

I believe that we have several different target user groups here:

1) People who target additive display devices (namely screens, not
paper), and are comfortable assuming that the whole world runs on
sRGB.  Cairo already caters to this group pretty well.

2) People who have heard of four-ink blacks, spot colours and
bleeding, but don't care about subtle differences in the precise hue
of magenta their printer uses.  They're more concerned with shapes and
impact, not colour.  This covers the vast majority of "business
graphics" and DTP needs.  They may or may not have heard of L*a*b*.
This group also includes people who want to deal with the weird and
stupid formats that video uses, as painlessly as possible.

3) Colour geeks, who will go on the Internet and rant for hours or
days on end if we make the wrong assumption about how they want to
do something.  We need to give them extremely low-level control with
very very few assumptions, so that they can predictably tweak it to
their liking.  They also account for perhaps 0.01% of the user
population - a number pulled from thin air, but you get my point.

The way around this is not to assume that the colour spaces of the
source and destination are sufficient to define the transform between
them, but to require a transform to be explicitly selected.  We can provide some
standard (and hopefully fast) ones to make life easier for group 2,
and a general method of defining new transforms to satisfy group 3.
Such general transforms would be capable of performing the
post-processing that *some* workflows require, to the exact
specifications of the user - but will not necessarily be fast.
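
To make the "explicitly selected transform" idea concrete, here is a
rough sketch of D = (colour(S) * M) OP D with OP fixed to a saturating
ADD.  Everything named here (colour_transform_fn, composite_add, the
two sample transforms) is hypothetical and is not existing Cairo or
Pixman API; the RGB-to-CMYK transform is deliberately naive and stands
in for whatever a real user would plug in.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHANNELS 8

/* Hypothetical callback type: converts one pixel's colour channels
 * from the source space into the blending space of the destination. */
typedef void (*colour_transform_fn)(const float *src, size_t n_src,
                                    float *dst, size_t n_dst,
                                    void *user_data);

/* A "standard, fast" transform of the kind group 2 would use. */
static void transform_identity(const float *src, size_t n_src,
                               float *dst, size_t n_dst, void *user_data)
{
    (void)user_data;
    assert(n_src == n_dst);
    for (size_t i = 0; i < n_dst; i++)
        dst[i] = src[i];
}

/* A naive RGB -> CMYK transform: no profiles, no ink limits.
 * Group 3 would replace this with a function of their own. */
static void transform_rgb_to_cmyk_naive(const float *src, size_t n_src,
                                        float *dst, size_t n_dst,
                                        void *user_data)
{
    (void)user_data;
    assert(n_src == 3 && n_dst == 4);
    float c = 1.0f - src[0];
    float m = 1.0f - src[1];
    float y = 1.0f - src[2];
    float k = c < m ? c : m;
    if (y < k)
        k = y;
    dst[0] = c - k;
    dst[1] = m - k;
    dst[2] = y - k;
    dst[3] = k;
}

/* D = (colour(S) * M) ADD D, on one pixel, saturating at 1.0. */
static void composite_add(const float *src, size_t n_src, float mask,
                          float *dst, size_t n_dst,
                          colour_transform_fn colour, void *user_data)
{
    float tmp[MAX_CHANNELS];

    assert(colour != NULL);          /* never guessed from the formats */
    assert(n_dst <= MAX_CHANNELS);

    colour(src, n_src, tmp, n_dst, user_data);
    for (size_t i = 0; i < n_dst; i++) {
        float v = dst[i] + tmp[i] * mask;
        dst[i] = v > 1.0f ? 1.0f : v;
    }
}
```

The point of the sketch is only that the caller passes the transform
in; the compositing loop itself never needs to know what the channels
mean.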

>> Naturally, printers will probably want to do some post-processing on
>> the final image to make it more suitable for the printer - they might
>> even convert it from CMYK to one of the extended formats, or vice
>> versa, to produce a better result.
>
> I think this is intended. The printer is supposed to convert "blending
> space" to the actual device.

>> Or they might perform a
>> normalisation on the channels (eg. x=min(C,M,Y); CMY -= x; K += x;)
>
> I got the impression that this is *not* wanted. If they wanted that the user
> would leave the blending space at the default of sRGB. This sort of
> modification is done when converting colors between spaces. Once it is in
> the printer's space then exact control over the levels is allowed.

Right.  My point is that this is precisely the kind of controversy
that we're never going to resolve to a single answer, and therefore it
should be out of scope.  If we can somehow provide the colour geeks
with tools that let them achieve what they want by low-level means,
that's a bonus.
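
For reference, the channel normalisation quoted above
(x=min(C,M,Y); CMY -= x; K += x;) is trivial to write as a user-side
post-process, which is part of why I think it belongs outside the
library.  A minimal sketch, assuming float channels in the range 0..1
and a clamp on K that the original one-liner does not mention; the
type and function names are made up for illustration:

```c
#include <assert.h>

typedef struct {
    float c, m, y, k;
} cmyk_t;

static float min3(float a, float b, float c)
{
    float x = a < b ? a : b;
    return x < c ? x : c;
}

/* Move the grey component shared by C, M and Y into the K channel. */
static cmyk_t normalise_cmyk(cmyk_t in)
{
    float x = min3(in.c, in.m, in.y);
    assert(x >= 0.0f);               /* channels assumed to be in 0..1 */

    cmyk_t out = { in.c - x, in.m - x, in.y - x, in.k + x };
    if (out.k > 1.0f)
        out.k = 1.0f;                /* assumption: saturate the black ink */
    return out;
}
```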

>> As far as implementation is concerned, we really just need to extend
>> Pixman to deal with arbitrary numbers of colour channels (from zero up
>> to, say, eight), plus an optional alpha channel.
>
> That is my hope but I have not gotten a clear answer about this. I would
> like the color experts to indicate if "blending space" means that linear add
> and multiply of the values are expected to work in this space. The
> alternative is the color space provides the function but that seems
> unworkable to me and would make it impossible for Cairo to add new
> compositing functions.

I think we have to *define* the colour channels of the blending spaces
as being independent and linear, regardless of which colour space is
involved.  Otherwise we open a massive can of worms as described
above.

Not all colour spaces can therefore be used as blending spaces.  YUV,
xyY and possibly L*a*b* are excluded because their channels are not
independently additive or subtractive.  XYZ, RGB and the various CMYK
variants should work fine.  The only substantial difference between
RGB and CMYK is that the ADD operation always makes things lighter in
RGB, and always darker in CMYK.
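
If the channels are defined as independent and linear, the ADD
operator can be one loop that is ignorant of the colour space.  A
minimal sketch on 8-bit channels (the function name is hypothetical,
not a Pixman entry point):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saturating per-channel ADD.  The loop is identical for RGB, XYZ and
 * CMYK; only the interpretation differs: a larger value means more
 * light in RGB and more ink in CMYK. */
static void add_saturate_u8(const uint8_t *src, uint8_t *dst,
                            size_t n_channels)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n_channels; i++) {
        unsigned sum = (unsigned)src[i] + (unsigned)dst[i];
        dst[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
}
```

Called with n_channels = 3 it lightens an RGB pixel; called with
n_channels = 4 on CMYK data the same arithmetic lays down more ink
and so darkens the result.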

>> This gets a lot less
>> cumbersome if we manage to fix a channel order (eg. RGB versus BGR is
>> already a Problem).  As long as the channels match up, Pixman doesn't
>> need to know what each one means.
>
> ABGR is only for 8-bit data and is a special case due to how hardware is
> designed. There is also xBGR 3-channel data that must also be supported.
>
> I think any data size larger than 8 bits can be stored in channel order with
> alpha on the end. The questions that need answering are padding (to a power
> of 2 or a multiple of 4 needed by many vector processors) and non-interlaced
> channels.

Those questions do need answering.

The other big question is how to specify a generic colourspace
transform at runtime - given that this can involve both matrix
multiplies and gamma curves even for conversions between additive
formats, and not always in the same order.  It gets worse if we
encourage using this mechanism for post-processing, or for an
intelligent conversion from additive spaces to the "extended"
subtractive spaces.  The most general answer is to have the user
provide a conversion function, which of course leads back to the
format question when considering the API to it.
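
One possible shape for a runtime-specified transform, short of a fully
opaque user function, is an ordered list of stages where each stage is
either a per-channel curve or a 3x3 matrix.  The sketch below is an
illustration only: the stage type and run_pipeline are invented names,
and the "gamma" curve is a plain square (gamma 2.0), not the real sRGB
curve.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { STAGE_CURVE, STAGE_MATRIX3 } stage_kind;

typedef struct {
    stage_kind kind;
    float (*curve)(float);   /* used when kind == STAGE_CURVE   */
    float matrix[3][3];      /* used when kind == STAGE_MATRIX3 */
} stage;

/* Illustrative gamma 2.0 decode; a stand-in for a real transfer curve. */
static float curve_square(float v)
{
    return v * v;
}

/* Apply the stages to one 3-channel pixel, strictly in the given order. */
static void run_pipeline(const stage *stages, size_t n_stages, float px[3])
{
    for (size_t s = 0; s < n_stages; s++) {
        const stage *st = &stages[s];
        if (st->kind == STAGE_CURVE) {
            assert(st->curve != NULL);
            for (int ch = 0; ch < 3; ch++)
                px[ch] = st->curve(px[ch]);
        } else {
            float in[3] = { px[0], px[1], px[2] };
            for (int r = 0; r < 3; r++)
                px[r] = st->matrix[r][0] * in[0]
                      + st->matrix[r][1] * in[1]
                      + st->matrix[r][2] * in[2];
        }
    }
}
```

Running curve-then-matrix and matrix-then-curve on the same pixel
generally gives different answers, which is exactly why the order has
to be part of the transform description rather than assumed.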

The simple format answer would be to define new format-type constants
that identify AXYZ, AK, ACMYK, ACcMmYK and ACMYKGO, and then permit
8-bit integer and 32-bit float component sizes for each, an arbitrary
number of colour components, and optionally ignoring the alpha
channel.  I've put Alpha first in each case, since that is consistent
with existing formats *and* allows adding spot colours to the end of
the list of channels.  This has several results that just fall out:

- CMY space (actually ACMY or xCMY) is a subset of ACMYK, just with
only three colour components.

- AXYZ, ACMY, xXYZ and xCMY can be operated on using the same code as
presently runs ARGB, ABGR, xRGB and xBGR.

- Duotone can be specified using AK with one spot colour (a total of
two colour components) - the spot being the mid-grey ink.  The same
goes for cheap two- or three-ink processes.  However, I suspect that
Duotone would be better as a post-process concept than a blending
space.

- ACMYKGO and ACcMmYK leave room for one spot colour in an 8-way
vector, so can reasonably be padded to that level.
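
To illustrate how those results fall out, here is a sketch of a format
descriptor along these lines.  All names are hypothetical, and I have
assumed that the leading slot is always present (holding either alpha
or ignored 'x' padding, as in xRGB) and that padding goes to the next
power of two.

```c
#include <assert.h>

typedef enum {
    FAMILY_AXYZ, FAMILY_AK, FAMILY_ACMYK, FAMILY_ACcMmYK, FAMILY_ACMYKGO
} colour_family;

typedef enum { COMPONENT_U8, COMPONENT_F32 } component_type;

typedef struct {
    colour_family  family;
    component_type type;
    unsigned       n_colour;    /* colour components, spot colours included */
    int            alpha_used;  /* 0 means the leading slot is ignored ('x') */
} pixel_format;

/* Process colours defined by each family, not counting the alpha slot. */
static unsigned base_components(colour_family f)
{
    switch (f) {
    case FAMILY_AXYZ:    return 3;
    case FAMILY_AK:      return 1;
    case FAMILY_ACMYK:   return 4;
    case FAMILY_ACcMmYK: return 6;
    case FAMILY_ACMYKGO: return 6;
    }
    return 0;
}

/* Components beyond the family's base set are treated as spot colours. */
static unsigned spot_colours(const pixel_format *fmt)
{
    unsigned base = base_components(fmt->family);
    return fmt->n_colour > base ? fmt->n_colour - base : 0;
}

/* Leading alpha/x slot plus the colour components. */
static unsigned slots(const pixel_format *fmt)
{
    assert(fmt->n_colour <= 8);
    return 1 + fmt->n_colour;
}

static unsigned padded_slots(const pixel_format *fmt)
{
    unsigned p = 1;
    while (p < slots(fmt))
        p <<= 1;
    return p;
}

static unsigned bytes_per_pixel(const pixel_format *fmt)
{
    unsigned size = (fmt->type == COMPONENT_U8) ? 1u : 4u;
    return padded_slots(fmt) * size;
}
```

With this layout, xCMY is simply FAMILY_ACMYK with n_colour = 3 and
alpha_used = 0, and it occupies the same four slots as xRGB.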

Unfortunately, ACMYK is 5 components, which is rather less efficient
for vector machines.  But is it worth providing a CMYK format (without
alpha) and using a separate mask channel?  Or would it at this point
be better to do it in planar fashion, which would also allow an
arbitrary number of spot colours?
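
For comparison, the two layouts differ only in how a component is
addressed.  A sketch with hypothetical helper names, counting offsets
in components rather than bytes and ignoring row strides:

```c
#include <assert.h>
#include <stddef.h>

/* Interleaved (packed): all channels of one pixel sit next to each
 * other, so n_slots is fixed per format and usually padded. */
static size_t offset_interleaved(size_t x, size_t y, size_t width,
                                 size_t n_slots, size_t channel)
{
    assert(x < width && channel < n_slots);
    return (y * width + x) * n_slots + channel;
}

/* Planar: each channel is its own width*height image, so adding a
 * spot colour only adds a plane and never changes the pixel stride. */
static size_t offset_planar(size_t x, size_t y, size_t width,
                            size_t height, size_t channel)
{
    assert(x < width && y < height);
    return channel * width * height + y * width + x;
}
```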

The mind doth boggle.

 - Jonathan Morton