[cairo] Cairo/pixman behavior with large translations

Thu Dec 6 15:44:29 PST 2012

On Thu, 06 Dec 2012 13:51:29 +0100
sandmann at cs.au.dk (Søren Sandmann) wrote:

> Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:
> 
> > The main culprits are functions pixman_transform_point_3d() and
> > pixman_transform_point() here:
> >     http://cgit.freedesktop.org/pixman/tree/pixman/pixman-matrix.c?id=pixman-0.28.0#n49
> > They perform multiplication of a matrix (16.16 fixed point) with a
> > vector (16.16 fixed point), to get a vector with 16.16 fixed point
> > values. This code can be upgraded to perform multiplication of the same
> > 16.16 fixed point matrix, but use something like 31.16 fixed point
> > for input vectors and get results in the 48.16 fixed point output
> > vectors. The caller then should be able to deal with the 48.16
> > results depending on the type of repeat set for the image. One
> > example is here:
> >     http://cgit.freedesktop.org/pixman/tree/pixman/pixman-inlines.h?id=pixman-0.28.0#n417
> > In this code, the "src_x" and "src_y" arguments are originally coming
> > from pixman_image_composite32() function and are already supposed to be
> > larger than 16 bits after the following commit:
> >     http://cgit.freedesktop.org/pixman/commit/?id=e841c556d59ca0aa6d86eaf6dbf061ae0f4287de
> > If they are getting converted to fixed point, we already get something
> > like 32.16 fixed point values which are asking for a larger vector
> > argument for pixman_transform_point_3d() function (though we might
> > want not to allow the use of full 32-bit range for "src_x" and "src_y"
> > in order to keep some headroom and safeguard against overflows). In any
> > case, immediately after pixman_transform_point_3d() we are
> > tweaking v.vector[0] and v.vector[1] according to the repeat type:
> >     http://cgit.freedesktop.org/pixman/tree/pixman/pixman-inlines.h?id=pixman-0.28.0#n432
> > Updating this code ("repeat" and "pad_repeat_get_scanline_bounds"
> > functions) to deal with 48.16 fixed point values from the vector can't
> > be too hard.
> 
> There is really a lot to be said for this 31.16 format. Intermediate
> results from matrix and vector multiplications fit in 64 bits, and image
> access coordinates up to +/- 1 billion pixels can be supported. Yet it
> is still a fixed-point format so there won't be any weird effects where
> positions are rounded differently depending on where they are on the
> screen.  If 64 bit coordinates become a performance problem, a new flag,
> "FAST_PATH_16_16_IS_ENOUGH" or something, could be added so that fast
> paths could use 32 bit coordinates. (We used to have something like this
> before pixman started to just return without compositing when it detects
> overflow).
> 
> It's very tempting to switch pixman over to using this format
> internally.
>
> The one thing I'm not sure of is whether this bug:
> 
>     https://bugs.freedesktop.org/show_bug.cgi?id=15355
> 
>     "Under a non-affine transform, or, in fact, any transform where the
>      'w' component of the (u,v,w) homogeneous source coordinate is not 1,
>      the representation of u,v,w as 16.16 fixed point numbers will
>      over/under flow on a regular basis even given fairly mundane
>      transformations (like a simple keystone correction). I've fixed the
>      intel driver to do these computations in floating point; I'm not
>      sure we want to try to use fixed point here."
> 
> is adequately taken care off by using this format.

Thanks for the link to this bug. Indeed, looks like this case is
not going to be handled well by returning just 48.16 results from
"matrix x vector" multiplication.

We need to keep all the fractional bits of the results. Which means
48.32 fixed point format. The shift by 16 in pixman_transform_point_3d()
function causes the loss of precious precision, which is needed to
ensure that the divisions by "w" provide reasonably accurate results in
bits_image_fetch_general() function:
    http://cgit.freedesktop.org/pixman/tree/pixman/pixman-bits-image.c?id=pixman-0.28.0#n552

Looks like the projective transformation just may need their own higher
precision variant of pixman_transform_point_*() function.

The performance of pixman_transform_point_3d() should not generally
be a big issue, because it is called a fixed number of times per
composite operation (and sometimes additionally once per scanline).
We can have any arbitrary precision for the results by using the
elementary school multiplication & division algorithms:
    http://en.wikipedia.org/wiki/Multiplication_algorithm#Long_multiplication
    http://en.wikipedia.org/wiki/Long_division

The 48.16 fixed point format is just good because it is supposed
to be convenient for the callers (as it fits into int64_t). But we
may return result vectors with the elements represented as arrays
or structs for projective transformations and have more than 64 bits
if needed.

> It's worth pointing out that even if there are useful non-affine
> transformations that still overflow, a homogeneous transformation matrix
> doesn't change if multiplied or divided by an arbitrary constant, so if
> worse comes to worst, we could drop precision by rounding the
> overflowing result down to 47 bits and then dividing all the other
> entries by a similar amount, in effect faking a floating point type.

Yes, this may work. It is also interesting to look at the bigger
picture and check how to optimize the precision and performance of
"x0" and "y0" calculation in the following loop from
bits_image_fetch_general():

    x0 = ((pixman_fixed_48_16_t)x << 16) / w;
    y0 = ((pixman_fixed_48_16_t)y << 16) / w;
    x += ux;
    y += uy;
    w += uw;

Looks like this is asking for using a reciprocal of "w", updated with
Newton-Raphson or something like this. And if the performance for
projective transformations is important, then it may make sense
converting these calculations to floating point. Wide integer
multiplications are not particularly fast.

-- 
Best regards,
Siarhei Siamashka