[cairo] [PATCH] Rely less on fast double-precision FPUs

Jonathan Morton jonathan.morton at movial.com
Tue Jun 1 10:59:08 PDT 2010

ARM CPUs have a big weakness in double-precision floating point.  In
particular, DP FP compares are ridiculously slow, mostly because the
result of the compare has to be transferred from the FPU to the CPU
using a serialising instruction.

For example, I estimate that _cairo_matrix_is_identity() - which is just
six chained comparisons - takes about 150 cycles on a Cortex-A8.

The attached patch massages a couple of the matrix property-testing
functions so that they don't need to use the FPU to do their job.  These
two functions stand out heavily in the profile of a simple "toy text"
exercise, and essentially disappear from it with this patch.  I expect
it may have a measurable effect on major text-intensive applications.
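
For readers without the attachment to hand, here is a rough sketch of the
general idea, not the patch itself. It assumes IEEE 754 binary64 doubles
and an 8-byte double. The names sketch_matrix_t, sketch_double_bits and
sketch_matrix_is_identity are hypothetical, made up for this illustration.
The trick is that 1.0 has exactly one bit pattern, and 0.0 has two (+0.0
and -0.0) that differ only in the sign bit, so both tests can be done in
the integer unit once the values are loaded as 64-bit integers.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 2D affine matrix of six doubles. */
typedef struct {
    double xx, yx;
    double xy, yy;
    double x0, y0;
} sketch_matrix_t;

/* Bit pattern of 1.0 in IEEE 754 binary64 (assumed format). */
#define SKETCH_ONE_BITS UINT64_C(0x3FF0000000000000)

/* Baseline: six chained floating-point comparisons. */
static int
sketch_matrix_is_identity_fp (const sketch_matrix_t *m)
{
    return (m->xx == 1.0 && m->yx == 0.0 &&
            m->xy == 0.0 && m->yy == 1.0 &&
            m->x0 == 0.0 && m->y0 == 0.0);
}

/* Copy the raw bits of a double into an integer without aliasing issues. */
static uint64_t
sketch_double_bits (double d)
{
    uint64_t bits;
    memcpy (&bits, &d, sizeof bits);
    return bits;
}

/* Integer-only variant: no FP compare is issued. */
static int
sketch_matrix_is_identity (const sketch_matrix_t *m)
{
    /* Non-zero if either diagonal entry differs from 1.0 in any bit. */
    uint64_t ones = (sketch_double_bits (m->xx) ^ SKETCH_ONE_BITS) |
                    (sketch_double_bits (m->yy) ^ SKETCH_ONE_BITS);

    /* OR the entries that must be zero; the shift below discards the
     * sign bit so that -0.0 is accepted, as it is by "== 0.0". */
    uint64_t zeros = sketch_double_bits (m->yx) |
                     sketch_double_bits (m->xy) |
                     sketch_double_bits (m->x0) |
                     sketch_double_bits (m->y0);

    return (ones | (zeros << 1)) == 0;
}
```

Both functions should agree on every input under the stated assumptions,
including NaNs, which neither treats as 0.0 or 1.0. Whether the integer
form is actually faster depends on the CPU and compiler.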

The Cortex-A9 is probably less affected by this problem, as it has an
improved FPU, but there are a *lot* of A8s and older versions out there.

I'd like people to check whether this change is also beneficial - or at
least not grossly the opposite - on x86, AMD64 and PowerPC.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Rely-less-on-DP-FPU-for-common-matrix-test-funcs.patch
Type: text/x-patch
Size: 993 bytes
Desc: not available
URL: <http://lists.cairographics.org/archives/cairo/attachments/20100601/452dd59a/attachment.bin>
