[cairo] Automated testing of Cairo

Fri Aug 18 03:30:58 PDT 2006

On Fri, 18 Aug 2006 02:13:20 -0700, Bryce Harrington wrote:
> Well, technically the git clone is sitting on one machine (the test
> driver), and the errors occur on all of the other systems that are
> trying to apply the patch the driver produces.

Sure. But you must have some mechanism of copying files from one to
the other, (since you're currently copying a patch from one to the
other somehow).

>                                                But yes, probably if we
> were using tarballs of cairo instead of patches, this problem wouldn't
> be there.  (This is what I've been doing with Inkscape, for instance).

> [After investigation it turned out that the problem was that patches
> made via git diff aren't guaranteed to apply to the official cairo
> tarballs, because the tarballs contain generated files not stored in
> git.]

The tar files do contain generated files not in git, (configure for
example), but that actually only broke the couple of systems of yours
that didn't have autoconf installed.

What led to the failed patch was the reverse problem: there are files
in git that aren't in the released tar files, (ROADMAP was the one in
particular that caused a problem). Bryce fixed this by generating a
tar file with "git tar-tree" and using that rather than using a
released cairo tar file.

It would be nice to fix this problem in general. The problem with
trying to get all of the files into the automake-generated tar files,
(which is what we're currently using for releases), is the pain of
manually maintaining the EXTRA_DIST variable to list them all. This is
extremely fragile and error prone and is very likely to break many
times in the future. (As nothing would indicate that we were missing a
file in a released tar file until potentially much later after the
release when the missing file was first modified and crucible would
fall over with a failed patch.)
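
One way to catch a missing file at release time rather than much
later would be to compare what git tracks against what actually
landed in the tar file. The following is only a rough sketch of that
idea in Python, not something cairo or crucible actually runs. The
function names are made up for illustration, and it assumes the tar
file unpacks into a single top-level directory.

```python
import subprocess
import tarfile


def git_tracked_files(repo_dir):
    """Return the set of paths git tracks in repo_dir (via `git ls-files`)."""
    out = subprocess.run(
        ["git", "ls-files"], cwd=repo_dir,
        check=True, capture_output=True, text=True,
    ).stdout
    return set(out.splitlines())


def tarball_files(tar_path):
    """Return regular-file paths in the tar file, minus the top-level directory."""
    with tarfile.open(tar_path) as tar:
        names = [m.name for m in tar.getmembers() if m.isfile()]
    # Assumption: everything lives under one directory such as cairo-X.Y.Z/
    return {n.split("/", 1)[1] for n in names if "/" in n}


def missing_from_tarball(tracked, shipped):
    """Files that are in git but absent from the tar file, sorted for stable output."""
    return sorted(set(tracked) - set(shipped))


# Illustrative data only: ROADMAP is tracked but was not shipped.
print(missing_from_tarball({"ROADMAP", "src/cairo.c"},
                           {"src/cairo.c", "configure"}))
```

Some tracked files, (.gitignore for example), are presumably meant to
stay out of a release, so a real check would likely need a short list
of expected exclusions.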

A git-generated tar file doesn't have any missing source, but it also
doesn't have a configure script in it, so it's not quite ready for use
as a release.

For now, it probably makes sense for us to continue to use
automake-generated tar files for releases. And then just be sure to
use something other than those for the use of crucible, (either

> To be honest, I'm not completely convinced about this, but only because
> there's just too many things I don't know about git.  For you, git is
> known and crucible is new, but for me it's the opposite.

Sure. So just take all that I said as me thinking out loud about one
way to design the system. And in the future when you are more
comfortable with git and are thinking about system design things, then
you can think back on what we've talked about here.

> > 1) SVG failing many tests

The SVG failures are pretty much gone now with the fix from
Behdad. The only exception is nfs11, so I don't know yet what is still
missing there.

> > 2) PS failing a handful of tests
> >
> >    This is the ghostscript version discrepancy we discussed earlier.
>
> Yes, but I'm still unclear about if this is something you'll need to
> fix, or if it's something I need to do?  If so, what?

If you want the tests to pass then you'll need to install the version
of ghostscript we used to generate the reference images, (ESP
Ghostscript 8.15.2). I believe this is the version that comes
naturally with Debian unstable and recent Fedora installations. I
don't know anything about gentoo.
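
A test driver could check the installed version before running the PS
tests. This is only a sketch: it assumes the binary is called "gs"
and that "gs --version" prints a bare version number such as 8.15.2,
which may not hold for every build. The function names are
illustrative.

```python
import subprocess

# Version named above as the one used to generate the reference images.
EXPECTED_GS_VERSION = "8.15.2"


def version_matches(reported, expected=EXPECTED_GS_VERSION):
    """Compare the version string a gs binary reported against the expected one."""
    return reported.strip() == expected


def installed_gs_version(gs_binary="gs"):
    """Ask the local ghostscript for its version (assumes `gs --version` works)."""
    result = subprocess.run(
        [gs_binary, "--version"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

A driver might then skip or flag the PS column whenever
version_matches(installed_gs_version()) is false, rather than report
those tests as cairo failures.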

If you can make an argument for settling on a different version of
ghostscript, then we can re-generate the reference images with that
and get the cairo developers to all switch. But let's try changing
your end first.

> > 3) Some systems fail every test using text
>
> All of the gentoo systems have ttf-bitstream-vera installed now,

This is definitely a required font for the tests. And very recently
Behdad made two tests, (ft-text-vertical-layout-*), also require
"Nimbus Sans L". It turns out that even Behdad and I didn't get our
two systems to agree on the exact glyph shapes associated with that
font name. Our plan to fix this going forward is to bundle all
required fonts directly with the test suite, so that should help
eliminate this class of problems.

> Regarding freetype, most systems have 2.1.9 installed.  2.1.10 is
> available in gentoo unstable but I have not yet upgraded the systems to
> that.  Would you like me to try that?

I've got freetype 2.2.1 here. But we do go out of our way to avoid, as
much as possible, anything in font rendering that would show
version-specific deviation, (we turn off all font hinting, etc.). For
example, I'm pretty sure I upgraded freetype not too long ago and I
don't think I saw any changes in the text test output.

Oh! Looking closer at the test results I don't think there's a font or
freetype version involved at all. If there were, the failures would
affect all backends (including the image backend) and would look like
just a bit of noise around the edges of the images. Instead, what I'm
seeing is failures only in the PS and SVG backends, with text either
not appearing at all or appearing as single-pixel dots. So the next
question to ask is:

	Is cairo creating correct .ps and .svg output and it's just
	the tools we are using (gs and svg2png_librsvg) to render it
	to a .png that are broken? Or is this a bug in cairo that is
	exposed on the systems you are using but not the ones we
	usually test cairo on for some reason?
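
The reasoning above amounts to a small triage rule, which could be
sketched roughly as follows. The backend labels ("image", "ps",
"svg") and the function name are assumptions made for illustration,
and the rule is deliberately crude: it treats any failure in the
image backend as a hint toward fonts or freetype.

```python
def triage_text_failure(failed_backends):
    """Guess where to look first, given the backends on which a text test fails."""
    failed = set(failed_backends)
    if not failed:
        return "no failure"
    if "image" in failed:
        # A font or freetype difference should show up in every backend,
        # including the image backend.
        return "suspect fonts or freetype"
    if failed <= {"ps", "svg"}:
        # Only the vector backends fail: either cairo's output is wrong
        # or the tool that renders it to .png is.
        return "suspect cairo ps/svg output or the rendering tools"
    return "unclear"


print(triage_text_failure({"ps", "svg"}))
```

This only narrows things down to the PS/SVG path. It can't tell a
cairo bug from a broken renderer, which is the question posed above.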

Bryce, you might even be of some direct help in chasing these
down. You could pick a text test failure out of an SVG column and go
look at the -svg-argb32-out.svg file and see if it looks like it
should render to the same result as the -ref.png image. If not, then
we've identified a bug in cairo.

> One thing I'd suggest is if a test fails, that the testsuite insert some
> sort of "error" image.

The test suite should be doing this...

> http://crucible.osdl.org/runs/1539/test_output/cairo-test/amd01/test/
>
> The broken image links don't really communicate that the test itself is
> failing; instead you wonder if something didn't get copied to the
> webserver correctly.

I think something very much like that is happening. For example, one
of the broken links on that page is to a missing image named:

	glyph-cache-pressure-svg-argb32-diff.png

Meanwhile, by poking around on amd01 I did notice that there's a file
of that name here:

/usr/src/cairo-1.2.2-g6122cc85/test/glyph-cache-pressure-svg-argb32-diff.png

So maybe the image-copying script is picking up the -ref and -out
images but missing the -diff images or something?
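
A quick way to test that theory would be to list the images in the
test directory that never reached the web server and count them by
suffix. A rough sketch, with made-up function names and, apart from
the -diff file named above, made-up file names:

```python
from collections import Counter

# Suffixes of the three kinds of images discussed above.
SUFFIXES = ("-ref.png", "-out.png", "-diff.png")


def missing_by_suffix(source_names, published_names):
    """Count source images absent from the published set, grouped by suffix."""
    published = set(published_names)
    counts = Counter()
    for name in source_names:
        if name in published:
            continue
        for suffix in SUFFIXES:
            if name.endswith(suffix):
                counts[suffix] += 1
                break
    return dict(counts)


# Illustrative data: only the -diff image failed to get copied.
source = [
    "glyph-cache-pressure-ref.png",
    "glyph-cache-pressure-svg-argb32-out.png",
    "glyph-cache-pressure-svg-argb32-diff.png",
]
published = source[:2]
print(missing_by_suffix(source, published))
```

If the result only ever contains "-diff.png", that would point at the
copying script rather than at the test suite.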

[snip useful insights on testing in new ways that the developers
haven't been using]

> One thing that would help me a lot, is if you could help identify where
> any of this testing has tangibly helped to improve Cairo.

I'll certainly do so as we move forward. And I think we've gotten rid
of enough false positives now that we're starting to expose some real
bugs. There's the PS/SVG text stuff I mentioned above.

Then there are the extra non-text failures on ppc01, for example. You
can ignore long-lines since it is expected to fail, (and should report
XFAIL in your next update). But the SVG result of mask-ctm on ppc01
looks like a real bug. That will be another useful one to look at more
closely.

> An ideal situation would be if we could report that due to this testing,
> that Mozilla (or other key Cairo-based app) user experience is N% better
> (in terms of bugs, performance, etc.) than it would have been without
> the testing.

I think we'll get a lot of this desired bang as we get some automated
performance tracking of cairo going at OSDL. When you can show pretty
charts of cairo's performance improving, that will translate directly
into very significant improvements for mozilla, OpenOffice.org and
other applications, (where there are quality benefits to switching to
cairo now, but performance penalties). Also, I have a very good
feeling that the existence of this automated performance tracking
will play an important role in motivating the effort needed to make
these performance improvements happen.

So I'll keep working away at getting the cairo performance test suite
started.

I'm hoping to have _something_ at least available before I leave on
vacation tomorrow night, so that you can start trying to find a way to
hook it up while I'm out for a week.

-Carl