[cairo] Numerical troubles in cairo and singular matrices

Joonas Pihlaja jpihlaja at cc.helsinki.fi
Tue Dec 19 19:00:13 PST 2006


Hi,

I've been recently working on bug 9276.

     https://bugs.freedesktop.org/show_bug.cgi?id=9276

In a nutshell, the bug shows that the miter join code in the
stroker sometimes gets the miter length test wrong and produces
really large polygons.  While we don't know yet what the root
cause of the bug is, it has brought to light some interesting
cases where numerical problems in cairo can cause some pretty
nasty things (even memory scribbling and subsequent segfaults.)

* Undefined casting

The old double -> fixed conversion in 1.2, which used plain casts, can
produce strange results that we're not handling very well.  In the
case of the bug, we're producing really large trapezoids that freeze
up the machine.  Quoting from the bug page:

> Apart from the fact that the user vector is potentially random garbage
> off the stack, one reason it is likely behaving differently on the x86
> is that the result of converting out of bounds float values all map to
> the same number 0x80000000.  I was shocked this was happening and had
> assumed it would have worked like the sparc does -- mapping out of
> bounds positive values of 0x7FFFFFFF and negative ones to 0x80000000.

Turns out that, according to C99, converting a float to an integer
type is undefined when the value is out of the integer's range.
Values derived from floats shouldn't be used as array sizes or indices
without extra logic to make sure they're sane.  For example, when
estimating the number of pen vertices needed to satisfy a given
tolerance, we're doing:

	num_vertices = ceil (M_PI / delta);

where delta is derived from the user/device space transform.  There's
no sanity checking on num_vertices however, so it's possible for the
user to force a segfault later when the allocation size of the vertex
array overflows.
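A minimal sketch of the kind of sanity check I mean (the bound and the
helper name here are made up for illustration, not actual cairo code):

    #include <math.h>

    /* Illustrative only: MAX_PEN_VERTICES is a made-up bound, not an
     * existing cairo constant. */
    #define MAX_PEN_VERTICES 1024

    static int
    estimate_num_vertices (double delta)
    {
        double estimate = ceil (M_PI / delta);

        /* NaN fails both comparisons, so a bogus delta (zero,
         * negative, NaN, or tiny enough to blow up the estimate)
         * falls through to the safe default. */
        if (estimate >= 1.0 && estimate <= MAX_PEN_VERTICES)
            return (int) estimate;

        return MAX_PEN_VERTICES;
    }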

* Unexpectedly small or large numbers

Next, really small numbers can cause infinite loops or performance
degeneracies to the point where a loop might as well be infinite.

For example, the stroker code has a few places where dashed strokes
are stepped by adding dash lengths to the dash offset -- but if the
dash lengths are small enough to never actually have an effect on the
dash offset, we're stuck forever.  The places this happens in the
stroker are: initialising the dash in _cairo_stroker_start_dash(),
stepping the dash with _cairo_stroker_dash_step_dash(), and stepping
the dash in _cairo_stroker_line_to().  (Adding logic to check for the
truly infinite looping case isn't really enough of course, since small
numbers may advance the dashes by such a small amount that the loop
might as well be infinite.  We should probably be doing something
clever with the tolerance value passed to the stroker, and the device
space resolution, to be safe.)
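The underlying failure mode is plain floating point absorption: once a
dash length is small relative to the accumulated offset, adding it
changes nothing.  A tiny standalone illustration (not cairo code):

    #include <stdio.h>

    int
    main (void)
    {
        double offset = 1.0;
        double dash_len = 1e-17; /* below 1.0's unit in the last place */

        /* The sum rounds back to the old offset, so a loop that only
         * terminates when the offset passes the end of the dash
         * pattern never makes progress. */
        if (offset + dash_len == offset)
            printf ("dash step had no effect\n");

        return 0;
    }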

Other problems with really small numbers are due to unintentional
underflow to zero.  There are a couple of places in cairo where
we're doing stuff like normalising vectors with sqrt(dx*dx +
dy*dy) when we "know" that the result should be nonzero, or
computing determinants which are "known" to be non-zero.  In bug
9276 this is causing the miter length test to fail.
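For instance, the squares can underflow long before the components
themselves do, so a "known" nonzero length can come out exactly zero
(standalone illustration, not cairo code):

    #include <math.h>
    #include <stdio.h>

    int
    main (void)
    {
        double dx = 1e-200, dy = 1e-200;

        /* dx*dx and dy*dy underflow to 0.0, so this "length" of a
         * perfectly good nonzero vector is exactly zero, and the
         * normalisation that follows divides by it. */
        double len = sqrt (dx*dx + dy*dy);

        printf ("len = %g\n", len);   /* prints len = 0 */
        return 0;
    }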

It's also possible to cause infinities by dividing by really small
numbers, which in turn lead to NaNs when we try to do undefined things
with them.  While NaNs and infinities can be safe, we ought to be
thinking about them while writing the code.  I haven't dug deeply
enough to find a concrete problem this is causing anywhere yet, but
NaNs do have a tendency to break case analyses that work for real
numbers.  It's possible to inject NaNs or infinities into cairo
directly, or simply by passing small enough values in a
transformation matrix (the matrix inverter underflows, makes infs,
then makes nans...)

Since a lot of this stuff revolves around atypical (but possible) user
input, the easiest way around it is to simply pick a suitable range
for input parameters and check that everything is kosher.  This should
work for all the user space coordinates, but for transforms we also
need to make sure that the transform or its inverse isn't itself going
out of bounds.
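A sketch of the kind of range check I mean for incoming coordinates
(the bound and the name are arbitrary, illustrative only):

    #include <math.h>

    /* Hypothetical bound on user coordinates; in practice we'd pick
     * it so that the double -> fixed conversion can never overflow. */
    #define COORD_BOUND 1e7

    static int
    coordinate_is_sane (double x)
    {
        /* NaN fails both comparisons, so this rejects NaNs as well
         * as out of range and infinite values. */
        return x >= -COORD_BOUND && x <= COORD_BOUND;
    }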

* Singular matrices

Our handling of degenerate transformation matrices could do with some
love.  Currently a lot of the code calls any matrix that has
determinant exactly zero degenerate.  Numerically, this doesn't even
pretend to be a robust test. ;) So what happens is that for a lot of
operations everything works as if we're handling near singular
matrices "fine" (modulo the issues with Really Large and Really Small
numbers above), but then all of a sudden when the determinant
vanishes we simply give up in disgust.

For example, a really thin ellipse as a pen works great until its area
goes to zero (and then cairo shuts down), so making a thin ellipse is
likely a good enough workaround for everyone who wants a pen with zero
area.  The trouble is, it's now the user who has to choose just how
near singular the matrix should be, and then we're potentially back in
Really Small/Big number land again. :|

Ok, so going through the code that actually uses the matrices, there
seem to be two main operations that are implemented in a way that
requires a non-singular matrix: a) going back and forth between user
and device space, and b) finding out if the transform has a reflection
component when mapping user-to-device by checking the determinant's
sign.

For the first use, it would be fairly simple to replace the current
calls to matrix_multiply using the inverse with calls to a linear
solver that computes a least squares error minimising solution.  The
best way to do this would likely involve some upfront decomposition of
the user-to-device transform matrix (say, some sort of
diagonalisation), but since we're only dealing with 2x2 matrices, it
should still be quite fast (2x2 not 3x3 because the translation
components don't affect the degeneracy of the matrix of course.)  We
can also do the decomposition lazily and cache the values to offset
the costs.

If we do do diagonalisation/decomposition, we're also in a great
position to check that the matrix's singular values aren't too large
or small.  Too large ones we can simply reject, and too small ones we
can force to zero to make a properly singular matrix.
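To make that concrete, here's a rough standalone sketch (not cairo
code; the function name and the EPSILON cutoff are made up) of a least
squares solve of A*x = b for a 2x2 matrix A = [a b; c d], using the
closed-form 2x2 SVD A = R(gamma) * diag(sx, sy) * R(beta) and clamping
small singular values to zero as described above:

    #include <math.h>

    #define EPSILON 1e-12   /* arbitrary cutoff for "really singular" */

    static void
    solve_least_squares (double a, double b, double c, double d,
                         double bx, double by,
                         double *x, double *y)
    {
        /* Closed-form SVD of the 2x2 matrix [a b; c d]. */
        double E = (a + d) / 2, F = (a - d) / 2;
        double G = (c + b) / 2, H = (c - b) / 2;
        double Q = hypot (E, H), R = hypot (F, G);
        double sx = Q + R, sy = Q - R; /* sy < 0 iff the matrix reflects */
        double a1 = atan2 (G, F), a2 = atan2 (H, E);
        double beta = (a2 - a1) / 2, gamma = (a2 + a1) / 2;
        double ux, uy;

        /* u = R(-gamma) * (bx, by): undo the left rotation. */
        ux =  cos (gamma) * bx + sin (gamma) * by;
        uy = -sin (gamma) * bx + cos (gamma) * by;

        /* Apply the pseudoinverse of diag(sx, sy): divide by each
         * singular value, dropping directions whose singular value
         * is (near) zero rather than dividing by noise. */
        ux = fabs (sx) > EPSILON ? ux / sx : 0.0;
        uy = fabs (sy) > EPSILON ? uy / sy : 0.0;

        /* x = R(-beta) * u: undo the right rotation. */
        *x =  cos (beta) * ux + sin (beta) * uy;
        *y = -sin (beta) * ux + cos (beta) * uy;
    }

Note that sy comes out signed in this formulation, which is exactly
the reflectivity information discussed next.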

For the second use of finding out whether the matrix reflects or not,
we're currently evaluating the sign of the determinant of the matrix.
I think we really don't need to be doing this for singular matrices at
all because, at least in the one case I looked at inside the pen
computation code where we're doing this, the singular transform
collapses pens into lines.

However, if it turns out that we do need some consistent and
valid-as-possible value for the reflectivity of (nonzero) singular
matrices, we should likely be computing the sign of the product of the
non-zero eigenvalues of the matrix.  This would correspond to looking
at only the principal subspace of the transform.  If we're doing
eigenvalue decompositions anyway, this would work out great and give
us well defined values for the reflectivity, instead of the
essentially arbitrary sign we're getting now from the determinant
computation (which should evaluate to zero, but is dominated by
noise.)  There's a small expected catch though: there are multiple
paths from a full rank matrix to the same singular matrix where the
limit of the sign of the determinant differs.  (For example,
diag(1, t) tends to diag(1, 0) whether t approaches zero from above or
below, but the determinant's sign along the way depends on the side.)
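For the 2x2 case this is cheap: a singular matrix has determinant
zero, so its eigenvalues are 0 and trace(A), and the sign of the
product of the nonzero eigenvalues reduces to the sign of the trace
(still undefined when the trace vanishes too, e.g. for nilpotent
matrices).  A sketch, with a hypothetical name:

    /* Illustrative only: reflectivity sign of a singular 2x2 matrix
     * with linear part [xx xy; yx yy].  Its eigenvalues are 0 and
     * xx + yy, so the product of the nonzero eigenvalues has the sign
     * of the trace; 0 means there's no well defined answer. */
    static int
    singular_reflect_sign (double xx, double yy)
    {
        double trace = xx + yy;
        return (trace > 0) - (trace < 0);
    }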

Finally, concerning singular matrices and the stroker in particular,
we should make sure that we're not generating degenerate output to any
of the backends, because we can't be sure that they deal with
singularities correctly.  For example, ghostscript sometimes produces
artifacts from filling polygons with empty subparts in them.  Even
within cairo, empty trapezoids break cairo_in_stroke() and/or
cairo_in_fill() because they're not testing for degenerate traps.

* Thoughts for the new matrix type

There's been talk of a new internal matrix type in cairo that could
speed up the common case of the identity transform and possibly have
other flags for fastpathing operations.  I think the new matrix type
should additionally get rid of the need for tracking inverse matrices
explicitly, instead providing functions to apply the transform
backwards and forwards.  It should also deal with singular matrices
robustly and have functions to query whether the transform is singular
or not, so that we can use that information to fastpath higher level
operations when the area collapses.  An SVD on the matrix would
be ideal for this.
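As a strawman (every name below is hypothetical, not existing cairo
API), the type might look something like:

    /* Strawman sketch of the proposed internal matrix type; all of
     * these names are made up, not existing cairo API. */
    typedef struct {
        double xx, yx, xy, yy, x0, y0; /* same layout as cairo_matrix_t */

        unsigned is_identity : 1;  /* fastpath flag */
        unsigned is_singular : 1;  /* area collapses to a line or point */

        /* Lazily computed SVD of the 2x2 linear part, cached for
         * repeated forward/backward transforms. */
        int    have_svd;
        double gamma, beta;        /* left/right rotation angles */
        double sx, sy;             /* singular values */
    } cairo_xform_t;

    /* Apply the transform forwards... */
    void cairo_xform_map (const cairo_xform_t *m, double *x, double *y);

    /* ...and backwards, via the pseudoinverse when the matrix is
     * singular, so callers never touch an explicit inverse matrix. */
    void cairo_xform_unmap (const cairo_xform_t *m, double *x, double *y);

    int  cairo_xform_is_singular (const cairo_xform_t *m);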

Cheers,

Joonas
