[cairo] Pixel consistency? (testing)

Fri Aug 14 13:00:00 PDT 2009

Hi Ian,

On Fri, 14 Aug 2009, Ian Britten wrote:

> So:
> - When producing images, is all the rendering controlled through
>    cairo(mm) and pixman (And System calls)?  Are there any other
>    factors that might alter the output (eg: Configuration options,
>    64-bit, other dependencies, etc)?
>    In other words, will the images generated by one platform/build
>    be the same as those generated by a different build?

The main controlling factor is the backend you choose, but even then 
there will probably be differences across versions of all the 
components involved.

> - For images generated through Cairo, are they currently
>    consistent across different versions of Cairo?  If no, is it a
>    goal of Cairo to eventually be consistent (Barring bug fixes, etc)?

Not "pixel exact" across versions.  I suspect that's not really a goal 
either.  Having said that, the image backend at least tries to stay 
within a fairly tight tolerance, but even there you will see slight 
variations across Cairo (and presumably also Pixman) versions as the 
underlying rendering code paths change.

> - Can anyone offer any insight into how they approached their
>    testing and these sorts of issues?  Currently, I'm just looking
>    at doing a straight RGB(A) comparison of the pixels, so any
>    difference will be reported as a failure.

It's an ongoing problem for cairo as well.  The approach the cairo 
test suite takes is: First check for an exact match.  If that fails, 
do a quick check that no more than some maximum proportion of pixels 
changed, and only very slightly, up to some per-backend tolerance.  If 
that fails, call an external "perceptual image diff" program[1] to 
give a yea or nay judgement based on a model of the human visual 
system.  It's all a bit fragile however, and it's still very easy to 
provoke test suite failures with small differences in how things like 
antialiasing or filtering are done.
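
As a rough illustration of the first two tiers described above, here 
is a minimal Python sketch.  It is not the cairo test suite's code: 
the function name, the pixel representation (flat lists of RGBA 
tuples) and both threshold values are made up for the example, and 
the third tier is only signalled, not run.

```python
def compare_pixels(reference, candidate,
                   max_channel_diff=2, max_changed_fraction=0.01):
    """Classify two images given as flat lists of (r, g, b, a) tuples.

    Both thresholds are hypothetical defaults chosen for illustration;
    a real suite would tune them per backend.
    """
    if len(reference) != len(candidate):
        raise ValueError("images must have the same number of pixels")

    # Tier 1: exact match.
    if reference == candidate:
        return "exact"

    # Tier 2: count the pixels that differ and track the largest
    # per-channel difference seen anywhere in the image.
    changed = 0
    worst = 0
    for ref_px, cand_px in zip(reference, candidate):
        diff = max(abs(a - b) for a, b in zip(ref_px, cand_px))
        if diff:
            changed += 1
            worst = max(worst, diff)

    few_enough = changed <= max_changed_fraction * len(reference)
    if worst <= max_channel_diff and few_enough:
        return "within_tolerance"

    # Tier 3: leave the verdict to a slower, smarter comparison
    # (for example an external perceptual diff tool).
    return "needs_perceptual_check"
```

A caller would treat "exact" and "within_tolerance" as a pass and only 
pay for the expensive comparison in the remaining case.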

One problem we're hitting is that the perceptual image diff program 
wasn't originally designed for vector graphics, but more for things 
like photorealistic images.  Another is that our test suite images 
are sometimes much too small for it to make any meaningful judgement.

One idea to improve the existing system in cairo, and for you to 
consider as well, might be to ditch the perceptual diff and use 
something less fancy but more suitable for vector graphics.  For 
example, Behdad suggested doing median filtering on the images to 
patch over the antialiasing differences.  Another idea for the same 
problem might be to have each test contain an importance mask which 
weights the interesting bits of each individual test image.  After 
all, we don't really need every test to be sensitive to antialiasing 
or filtering artifacts.  Creating these masks could probably be made 
semi-automatic and work reasonably well for tests with solid patterns 
only.  I'm not sure what to do for more complex patterns however.
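
To make the two ideas above a bit more concrete, here is a small 
Python sketch of a 3x3 median filter and a mask-weighted difference. 
Neither exists in cairo's test suite as far as this message says; the 
function names, the greyscale 2D-list image format and the 0..1 mask 
weights are all assumptions made for the example.

```python
from statistics import median


def median_filter(image):
    """Apply a 3x3 median filter to a 2D list of grey values.

    Pixels on the border use only the neighbours that exist, so their
    window is smaller than 3x3.
    """
    height, width = len(image), len(image[0])
    filtered = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [image[ny][nx]
                      for ny in range(max(0, y - 1), min(height, y + 2))
                      for nx in range(max(0, x - 1), min(width, x + 2))]
            row.append(median(window))
        filtered.append(row)
    return filtered


def masked_difference(reference, candidate, mask):
    """Return the mask-weighted mean absolute difference of two images.

    `mask` holds a weight per pixel (0 = ignore, 1 = fully important).
    An all-zero mask yields 0.0.
    """
    total = 0.0
    weight = 0.0
    for ref_row, cand_row, mask_row in zip(reference, candidate, mask):
        for ref_px, cand_px, w in zip(ref_row, cand_row, mask_row):
            total += w * abs(ref_px - cand_px)
            weight += w
    return total / weight if weight else 0.0
```

The two could be combined by filtering both images first and then 
comparing the masked difference against a threshold picked per test.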

Cheers,

Joonas

[1] http://pdiff.sourceforge.net