[cairo] Automated testing of Cairo

Bryce Harrington bryce at osdl.org
Wed Aug 16 13:00:54 PDT 2006

On Wed, Aug 16, 2006 at 02:00:12AM -0700, Carl Worth wrote:
> On Thu, 10 Aug 2006 15:33:22 -0700, Bryce Harrington wrote:
> > > 	1.2.3-c3c7068 (diff)
> > >
> > >    where the "1.2.3-c3c7068" links into gitweb and the "diff" is a
> > >    link to the diff you currently provide.
> >
> > Hmm, I may have to punt on this one.
> I'm a bit confused here as the current result I see for:
> 	2006-08-10 cairo-1.2.2
> 	c3c7068 master (diff)
> seems exactly like what I was asking for. So thanks!

Ah cool.  The bit I was unsure about was making it report 1.2.3 instead
of 1.2.2, but if you're okay with it showing 1.2.2 in the above, that's
great.  :-)

> > > 3) The results all say "OK" now regardless of what failures
> > >    exist. We're going to need to make that say something very
> > >    different than "OK" for failures if this is going to be useful. ;-)
> >
> > Yes!  This was something I wanted to bring up.  Can you give me some
> > sort of heuristic to use for differentiating between OK and BAD runs?
> The most reliable indication of pass/failure is the return value of
> each individual test, and that return value is already being examined
> by the program that runs when you call "make test" or "make check" and
> it prints PASS, FAIL, XPASS, or XFAIL as appropriate.
> So the most reliable thing to look at is the output of "make test" or
> "make check" looking for any lines beginning with FAIL. If you get any
> of those, then the test run is not OK, (assuming we fix all the false
> positives first---more on that below).

Okay, I've added greps for these.  I also added a count of the fails
detected.  (I experimented with showing counts for xfail and fail
separately, but that got cluttered and confusing.)


The numbers shown are (fails / fails+passes).  I didn't want to show the
fails without the totals because then it'd look like Cairo is
regressing; 69 fails in 1.2.2 compared with 10 in 1.0.0 sounds bad until
you realize there were only 110 tests in 1.0.0, and now there are 735. ;-)
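For reference, the counting boils down to something like the following (a
sketch only; the log file name and the exact grep patterns are my
assumptions, based on the PASS/FAIL/XPASS/XFAIL prefixes that automake-style
test harnesses print):

```shell
#!/bin/sh
# Sketch: summarize a saved "make check" log as "fails / total".
# Counts only lines that start with one of the harness result words.
# XFAIL is an *expected* failure, so it counts toward the total but
# not toward the fails.
summarize() {
    fails=$(grep -c -E '^FAIL' "$1")
    total=$(grep -c -E '^(PASS|FAIL|XPASS|XFAIL)' "$1")
    echo "$fails / $total"
}

# Typical use:  make check 2>&1 | tee check.log ; summarize check.log
```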

If you have other ideas on how to make the display more immediately
useful, let me know and I'll see what I can do.

> > Okay; give me a listing of the prerequisites to install in order to get
> > all these things turned on, and I'll get it set up.  (We'll need to
> > update the images for the various machines, so it's easiest if we can do
> > all the deps at once.)
> For the SVG and PDF backends you'll want rather recent versions of
> librsvg and poppler. I'm afraid I don't know precisely which versions
> are necessary to get no test failures. I believe most of the reference
> images have been generated with both librsvg and poppler from recent
> cvs checkouts, while other packages (such as ghostscript) have come
> from Debian unstable or various versions of Fedora.
> Other things that are necessary are freetype, (recent versions?), and
> the Bitstream Vera font.
> We've definitely done a very poor job of documenting all of these
> dependencies and version that are required for getting the magic "all
> tests give expected results" state. Hopefully, by going through a
> round of this with you we can generate that list more precisely.

Well, here is what's in gentoo for x86 stable:

# USE="X" emerge cairo librsvg poppler ghostscript-esp freetype ttf-bitstream-vera
 x11-libs/cairo-1.0.4 [1.0.2] USE="X* png -doc -glitz"
 x11-libs/pango-1.10.3  USE="-debug -doc"
 dev-libs/atk-1.10.3  USE="-debug -doc -static"
 x11-libs/gtk+-2.8.12  USE="jpeg -debug -doc -tiff -xinerama"
 media-libs/libart_lgpl-2.3.17  USE="-debug"
 gnome-extra/libgsf-1.12.1  USE="-bzip2 -debug -doc -gnome -static"
 dev-libs/libcroco-0.6.0  USE="-debug"
 gnome-base/librsvg-2.12.7  USE="zlib -debug -doc -gnome -nsplugin"
 app-text/poppler-0.5.1-r1  USE="jpeg"
 app-text/ghostscript-esp-8.15.1_p20060430  USE="X* cups -cjk -emacs -gtk -threads -xml"
 media-libs/freetype-2.1.9-r1  USE="zlib -bindist -doc"
 media-fonts/ttf-bitstream-vera-1.10-r3  USE="X"

Let me know if these look like adequate versions.
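One quick way to compare a machine against the reference environment, once
things are installed, is to ask pkg-config what it actually sees.  This is
just a sketch; the .pc package names below are my guesses and may differ
per distribution:

```shell
#!/bin/sh
# Sketch: print the versions pkg-config reports for the libraries the
# test suite is sensitive to.  Package (.pc) names are assumptions.
for pkg in cairo librsvg-2.0 poppler freetype2; do
    ver=$(pkg-config --modversion "$pkg" 2>/dev/null || echo "not found")
    printf '%-12s %s\n' "$pkg" "$ver"
done
```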
> > nfs08:  GPL Ghostscript 8.50 (2005-12-31)
> >         Input formats: PostScript PostScriptLevel1 PostScriptLevel2
> >         PostScriptLevel3 PDF
> >         Default output device: x11
> > nfs09:  Not installed
> > nfs11:  ESP Ghostscript 8.15.2 (2006-04-19)
> >         Input formats: PostScript PostScriptLevel1 PostScriptLevel2
> >         PostScriptLevel3 PDF
> >         Default output device: bbox
> OK. So my ghostscript is also "ESP Ghostscript 8.15.2 (2006-04-19)" so
> I guess that's currently the "magic" version for the sake of the test
> suite reference images. I can't make sense of the versions above to
> understand if that is newer or older than GPL Ghostscript 8.50.

I notice on some platforms, the ESP Ghostscript isn't available through
gentoo.  They have some virtual package thingee set up such that when
you 'emerge ghostscript', you might get the ESP version or the GNU
version.

> > > 6) The only current failure I see that isn't an obvious false positive
> > >    like those described above is the failure of
> > >    ft-text-vertical-layout which can be seen here:
> > >
> > > 	http://crucible.osdl.org/runs/1466/test_output/cairo-test/nfs11/test/
> ...
> > Hopefully this has the info you need:
> >
> >    http://crucible.osdl.org/runs/1466/sysinfo/nfs11.1/
> So the big different variable there is "gentoo" rather than
> "debian". That could mean a lot of different things of
> course. Debugging this one might be easiest for me once I can get
> logged in to that machine and poke around.

You have access now; feel free to log in and poke around.  See the
tutorial at http://crucible.osdl.org for how to lock the machine if you
want to prevent it from running tests while you're investigating it.

> > Okay.  There's probably going to be a variety of different issues to
> > work through, but I've just now hooked them all up:
> >
> >   http://crucible.osdl.org/runs/cairo_branches.html
> >
> > amd01:  OK
> > ppc01:  Fails during configure
> > ita01:  Bunch of interesting issues during make check
> > nfs12:  OK
> So it would be helpful to have more details on those problems. Got any
> now? Or maybe I can get them myself later.

Yes, if you click on the testrun ID you can see all of the logs in
detail.  Specific things to look at:

In the top directory are log files named $sut.log, which give an overview
of what each machine did during testing.  This is a good place to look
first, to see where in the process things went awry.  Looking at
ppc01.log, for example, you can see that it failed during configure.

The logs/ directory contains the details.  For kernel stuff we break
these up into separate files for configure, make, make check, etc., but
for cairo it's all in one file.  For convenience, I've now also
hyperlinked this log file from the cairo_branches.html page.

The sysinfo/ directory contains system level log and config files.  

The test_output/ directory contains the output from tests that are run.

If there are additional files/logs/instrumentation or whatever we aren't
capturing that would be helpful to have, let me know.
