[cairo] Automated testing of Cairo

Fri Aug 18 13:27:01 PDT 2006

On Fri, Aug 18, 2006 at 03:30:58AM -0700, Carl Worth wrote:
> On Fri, 18 Aug 2006 02:13:20 -0700, Bryce Harrington wrote:
> > Well, technically the git clone is sitting on one machine (the test
> > driver), and the errors occur on all of the other systems that are
> > trying to apply the patch the driver produces.
> 
> Sure. But you must have some mechanism of copying files from one to
> the other, (since you're currently copying a patch from one to the
> other somehow).

Actually they get them via the NFS mount, so there's no copying in the
rsync/scp sense.  :-)

Of course, in that case, why not put the git trees themselves onto NFS?
That could make sense, although care would need to be taken that the
clients only access them read-only, or else that machines take a lock
before making a modification.  Care would also be needed to ensure
there are no race conditions when doing git fetches; e.g., if
you're in the middle of updating the tree, and a client uses the tree to
pull code, will that work or will there be inconsistencies?
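
One way the mid-update question could be sidestepped, as a rough sketch
only: the driver does its fetch in a private directory and then publishes
the finished snapshot by switching a symlink in a single rename, so a
client sees either the old tree or the new one, never a half-updated one.
The function name and the directory layout are hypothetical, this is not
how crucible works today, and whether rename is reliably atomic over a
given NFS setup would need to be checked.

```python
import os

def publish_snapshot(snapshot_dir, current_link):
    """Point current_link at snapshot_dir using a single rename.

    Hypothetical helper: snapshot_dir is a fully updated tree and
    current_link is the path the read-only clients follow.
    """
    link_dir = os.path.dirname(os.path.abspath(current_link))
    # Create the new symlink under a temporary name in the same directory,
    # so the final rename never crosses a filesystem boundary.
    tmp_link = os.path.join(link_dir, ".current.%d.tmp" % os.getpid())
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(snapshot_dir, tmp_link)
    # os.replace maps to rename(2): the old link is swapped for the new one
    # in one step on a local POSIX filesystem.
    os.replace(tmp_link, current_link)
```

The driver would call publish_snapshot() only after the fetch has
completed, and clients would read through the link only.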

> For now, it probably makes sense for us to continue to use
> automake-generated tar files for releases. And then just be sure to
> use something other than those for the use of crucible, (either

There are about 4 different ways I can think of to solve this...  ;-)

But the approach that feels the best is to have crucible pull both the
automake-generated version from the website, as well as build its own
tarballs from git for use with the git patches.

One of the reasons I like to keep the official tarball as a base is
that it is what lives in the wild, so it ensures our test results
will be directly comparable to those of anyone else who is running tests
on stock cairo.

> > To be honest, I'm not completely convinced about this, but only because
> > there's just too many things I don't know about git.  For you, git is
> > known and crucible is new, but for me it's the opposite.
> 
> Sure. So just take all that I said as me thinking out loud about one
> way to design the system. And in the future when you are more
> comfortable with git and are thinking about system design things, then
> you can think back on what we've talked about here.

Great, sounds good.

Fwiw, we're in the middle of a major refactoring of how the patching
works internally - until now we've had one set of code for patching and
building the kernel, and a different set for everything else, and we are
working to merge them into one piece of code that does both.  The main
motivation is to add more flexibility in how patches can be mixed
together for a given test.  I think this refactoring will also
compartmentalize the patch process such that redoing it around git down
the road may be easier.

> > > 1) SVG failing many tests
> 
> The SVG failures are pretty much gone now with the fix from
> Behdad. The only exception is nfs11, so I don't know yet what is still
> missing there.

Hmm.  I double-checked the dependencies and versions on that machine and
everything looks okay...  I also reinstalled librsvg, but still no luck.

Looking through the *.log files, three types of errors seem to be
repeating:

1.  Failure to set [xlib|ps|svg] target
2.  Image size mismatch (AxB at C) vs. (A+1xB+1 at D)
3.  XXX pixels differ from reference image

I'm curious about the second error there; it seems to correspond most
closely to the failures unique to this machine.  The fact that the sizes
are off by one, with a wildly different third number (resolution maybe?),
makes me wonder if it could be a platform-specific thing.
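
To see how many of the size mismatches really are off by exactly one,
something like the following could be run over the *.log files.  This is
only a sketch: the regular expression is my guess at the message shape
from the description above (it accepts either " at " or "@" between the
size and the third number), and the function name is made up.

```python
import re

# Assumed message shape: "(AxB at C) vs. (A+1xB+1 at D)".  The real log
# lines may differ, so treat this pattern as a starting point.
SIZE_MISMATCH = re.compile(
    r"\((\d+)x(\d+)(?:@| at )(\d+)\) vs\. \((\d+)x(\d+)(?:@| at )(\d+)\)"
)

def off_by_one_mismatches(lines):
    """Return (w1, h1, w2, h2) for mismatches where both sizes differ by one."""
    hits = []
    for line in lines:
        m = SIZE_MISMATCH.search(line)
        if not m:
            continue
        # The third number in each group is ignored here; only sizes matter.
        w1, h1, _, w2, h2, _ = map(int, m.groups())
        if abs(w2 - w1) == 1 and abs(h2 - h1) == 1:
            hits.append((w1, h1, w2, h2))
    return hits
```

If nearly every mismatch on nfs11 lands in this list and none do on the
other machines, that would at least support the platform theory.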

The unique thing about nfs11 is that it's a 64-bit Xeon, running in an
emulated 32-bit mode.  I don't know enough about how this emulation
works to say whether it'd cause an error like this, though.  What do you
think?

> > > 2) PS failing a handful of tests
> > >
> > >    This is the ghostscript version discrepancy we discussed earlier.
> >
> > Yes, but I'm still unclear about if this is something you'll need to
> > fix, or if it's something I need to do?  If so, what?
> 
> If you want the tests to pass then you'll need to install the version
> of ghostscript we used to generate the reference images, (ESP
> Ghostscript 8.15.2). I believe this is the version that comes
> naturally with Debian unstable and recent Fedora installations. I
> don't know anything about gentoo.
>
> If you can make an argument for settling on a different version of
> ghostscript, then we can re-generate the reference images with that
> and get the cairo developers to all switch. But let's try changing
> your end first.

Nope, no reason to make everyone else change for this.  We can
definitely get all the machines pinned down to this version; I just
needed to know that this is what must be done.  8.15.2 is marked
unstable for gentoo, so by default it installs 8.15.1, except that for
some reason a few systems preferred to install ghostscript-gnu
instead.  But it looks like I can force all of them to run 8.15.2.
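
A pre-flight check along these lines could catch a machine that has
drifted to a different ghostscript before a test run starts.  It is a
sketch only: it assumes `gs` is on the PATH and that `gs --version`
prints a bare version string, and the function names are made up.

```python
import subprocess

REQUIRED_GS = "8.15.2"  # version used to generate the reference images

def gs_version_ok(reported, required=REQUIRED_GS):
    """Return True if the reported version string matches the required one."""
    return reported.strip() == required

def installed_gs_version():
    """Ask the local gs binary for its version (assumes gs is on PATH)."""
    result = subprocess.run(
        ["gs", "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Each client would then run gs_version_ok(installed_gs_version()) and
skip or flag the PS tests when it returns False.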

Also, while it's not necessary for this case, crucible also has a way
to do post-processing of test results on the test driver.  This way, if
there is some analysis or reporting tool to run on the results that may
not work identically on every system (such as a tool that only works on
x86), the post-processing can be centralized there.  This feature is
really designed for doing comparisons across test runs (e.g., graphing
whether today's changes make things faster than yesterday's, or diffing
results from several different platforms).
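
As an illustration of the kind of cross-platform diffing meant here, a
post-processing step might look roughly like this.  It is not crucible's
actual interface; it assumes each run's results have already been
reduced to a mapping from test name to status, and the function name is
made up.

```python
def diff_results(baseline, other):
    """Return {test: (baseline_status, other_status)} where the runs disagree.

    Both arguments map test names to status strings; a test absent from
    one run is reported with the placeholder status "MISSING".
    """
    changed = {}
    for test in sorted(set(baseline) | set(other)):
        a = baseline.get(test, "MISSING")
        b = other.get(test, "MISSING")
        if a != b:
            changed[test] = (a, b)
    return changed
```

Run against one machine's results and another's, this would list only
the tests whose outcome differs between the two.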

> > > 3) Some systems fail every test using text
> >
> > All of the gentoo systems have ttf-bitstream-vera installed now,
> 
> This is definitely a required font for the tests. And very recently
> Behdad made two tests, (ft-text-vertical-layout-*), also require
> "Nimbus Sans L". It turns out that even Behdad and I didn't get our
> two systems to agree on the exact glyph shapes associated with that
> font name. Our plan to fix this going forward is to bundle all
> required fonts directly with the test suite, so that should help
> eliminate this class of problems.

This is a very good idea.

More later...

Bryce