[cairo] Patches for speeding up _cairo_fixed_from_double

Daniel Amelang daniel.amelang at gmail.com
Mon Nov 6 15:28:44 PST 2006


On 11/6/06, Carl Worth <cworth at cworth.org> wrote:
> On Wed, 1 Nov 2006 22:53:33 -0800, "Daniel Amelang" wrote:
> > +  *)
> > +    m4_default([$3],
> > +      [AC_MSG_ERROR([
> > +
> > +Unknown float word ordering. You need to manually preset
> > +ax_cv_c_float_words_bigendian=no (or yes) according to your system.
>
> First, under what conditions is this path expected to execute? Missing
> grep I suppose? Rather than breaking the build here, should we perhaps
> just disable the optimization in this case?
>
> And the error message should provide the user with more direction,
> especially if it's going to break the build. For example, where and
> how would I "manually preset" this variable and value? How might I
> determine the correct value?

Hmmm...I was following the behavior of the built-in autoconf macro
AC_C_BIGENDIAN (that cairo already uses, BTW) which has this same
logic and uses the same technique (and a similar message). Regardless,
I think that there is only one scenario where it would fail (autoconf
requires grep, so absence of grep shouldn't be a reason to fail): the
platform doesn't use IEEE-754 to store its floats and we already
decided not to worry about those systems here:

http://lists.freedesktop.org/archives/cairo/2006-October/008287.html

So I vote to just break the build at that point and say "Cairo
requires floats to be stored in IEEE-754 format". If in the future,
some bizarre system appears that doesn't, and we want to support it,
we can go back and add a new check and a new flag
(IEEE754_FLOAT_FORMAT?) and add guards to the code that uses
FLOAT_WORDS_BIGENDIAN and provide a non-IEEE path.

> > + * 2) It doesn't function properly if the FPU is in single-precision
> > + *    mode.
>
> Is this really true? If so, this is a problem. People have encountered
> bugs with cairo in single-precision FPU mode before. For example, see:
>
>         _cairo_color_compute_shorts fails with FPU set to single precision
>         https://bugs.freedesktop.org/show_bug.cgi?id=7497
>
> So we've fixed bugs in this area before, and I'd like to not make any
> regressions here. It would be a nasty thing to check for since it's a
> run-time condition. But perhaps the current code works fine. Have you
> tested and verified failure in single-precision mode?

Yes. For those interested, here are the two lines it takes to put your
x86 into single precision mode (using gcc asm):

unsigned short single_prec_control_word = 4210;
__asm__ __volatile__ ("fldcw %0" : : "m" (single_prec_control_word));

I was aware of this problem (and the bug report), and had forgotten to
include my rationale for going ahead despite this in the original
email with the patch. I'll include that now:

1) AFAIK, there is only one use case for mixing cairo with
single-precision FPU mode: cairo + DirectX. It is very uncommon for
developers to switch the precision on their own.

2) Developers can easily prevent DirectX from interfering with the
precision mode by setting the D3DCREATE_FPU_PRESERVE flag when the D3D
device is created.

3) Other popular software APIs currently solve this problem by
requiring and documenting that single-precison not be used, and when
mixing with DirectX, the D3DCREATE_FPU_PRESERVE must be set. Examples
include:

Apple's Quicktime:
http://developer.apple.com/documentation/QuickTime/Conceptual/QT7-1_Update_Guide/Content/2NewFeaturesChangesa.html

Lua programming language:
http://www.lua.org/bugs.html (mentioned here not because it's a bug,
but because people think it is)

Microsoft's own WMP API:
http://windowssdk.msdn.microsoft.com/en-us/library/aa389082.aspx

IMHO, this optimization, and others forthcoming that use this
technique, are too valuable to pass up just so that DirectX+cairo
users don't have to remember to set a flag. So my proposal is to
clearly document that cairo doesn't function properly in
single-precision mode, explicitly instructing DirectX users to set the
D3DCREATE_FPU_PRESERVE flag.

The only alternatives I see are 1)leaving out the optimization or
2)leaving it out just on Win32, or 3)setting the precision mode
ourselves (in cairo_create for example). I'm not too thrilled about
any of those, but who am I to say.

> > +    srand (time (0));
>
> I really don't like seeing non-determinism in the performance test
> suite...I switched to a constant seed for srand in the commit I made.

Totally understandable, good move.

Dan


More information about the cairo mailing list