[cairo] Pixman refactoring, ARM and Altivec implementations needed

Jonathan Morton jonathan.morton at movial.com
Mon Jun 1 02:46:59 PDT 2009


> >> Having said that, if armv6+ isn't in the compiler spec, how did armv6
> >> simd get enabled in the first place?
> >
> > The problem is that armv6 simd is *not* getting enabled in this case.

This puzzles me.  The configure tests for ARM-SIMD and ARM-NEON are
actually rather similar, except that the SIMD one uses an asm fragment
(so it relies only on the assembler), while the NEON one uses an
intrinsic (so it relies on both compiler and assembler support).

The only concrete difference is that the NEON fragment explicitly
enables a flag that turns on NEON support.  AFAIK the only way to do
that for the v6 SIMD is to specify one of the specifically v6-or-better
CPU models (or one of the v6 architectures), which strikes me as *very*
dirty.  I suppose it might be acceptable to force -march=armv6 for a
single file.
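
To make the dependency on the selected architecture concrete, here is a
minimal sketch.  It is not pixman's actual configure fragment.  UQADD8
stands in as a representative v6 SIMD instruction, the function name
sat_add_u8x4 is hypothetical, and the __ARM_FEATURE_SIMD32 guard is an
ACLE macro that I am assuming the toolchain defines (older compilers may
not).  Without a v6-or-better -march/-mcpu the assembler would reject
the instruction, which is why the guard is needed.

```c
#include <stdint.h>

/* Hypothetical helper: add four packed unsigned bytes with saturation. */
static uint32_t sat_add_u8x4(uint32_t a, uint32_t b)
{
#if defined(__arm__) && defined(__ARM_FEATURE_SIMD32)
    /* Assumption: this macro is only defined when the target selected
     * on the command line (e.g. -march=armv6) has the v6 SIMD
     * instructions, so the assembler will accept UQADD8 here. */
    uint32_t r;
    __asm__("uqadd8 %0, %1, %2" : "=r"(r) : "r"(a), "r"(b));
    return r;
#else
    /* Portable fallback with the same per-byte saturating semantics. */
    uint32_t r = 0;
    for (int i = 0; i < 4; i++) {
        uint32_t s = ((a >> (8 * i)) & 0xff) + ((b >> (8 * i)) & 0xff);
        if (s > 0xff)
            s = 0xff;               /* clamp instead of wrapping */
        r |= s << (8 * i);
    }
    return r;
#endif
}
```

Building only the file that contains such a function with -march=armv6
keeps the flag from leaking into code that must still run on older
cores.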

There's a side-effect there which wouldn't show up in the trunk, but
would show up with my object-pool optimisation, which also relies on v6
instructions but doesn't check for support at runtime.  Obviously the
object-pool is quite a large hammer for the problem I was trying to
solve.
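
For reference, a runtime check of the kind mentioned above could look
roughly like this.  It is only a sketch under assumptions: it parses a
space-separated feature line such as the "Features" line Linux exposes
in /proc/cpuinfo, the helper name has_feature is hypothetical, and it
does not read the file itself or reflect how pixman actually detects
CPU features.

```c
#include <string.h>

/* Hypothetical helper: return 1 if `name` appears as a whole
 * whitespace-delimited token in `line`, otherwise 0. */
static int has_feature(const char *line, const char *name)
{
    size_t n = strlen(name);
    const char *p = line;

    if (n == 0)
        return 0;
    while ((p = strstr(p, name)) != NULL) {
        /* Require a token boundary on both sides so that "vfp" does
         * not match inside "vfpv3". */
        int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\n';
        if (starts && ends)
            return 1;
        p += n;
    }
    return 0;
}
```

A caller would run this once at start-up and only select the optimised
path when the check succeeds.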

Another option is to make the NEON layer not depend on SIMD.  This
would, however, presently have the side-effect of partially hiding a big
problem for users.
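
To show what "not depending on SIMD" would mean in practice, here is a
small sketch of layered implementations where each layer hands
unsupported operations to a delegate.  The structure, the names and the
operation numbers are all hypothetical; this is an illustration of the
idea and not pixman's real data structures.

```c
#include <stddef.h>

/* Hypothetical layer: handles some operations, delegates the rest. */
typedef struct impl {
    const char *name;
    int (*can_handle)(int op);    /* nonzero if this layer implements op */
    const struct impl *delegate;  /* next layer to try, or NULL */
} impl_t;

static int neon_ops(int op)    { return op == 0; }
static int simd_ops(int op)    { return op == 0 || op == 1; }
static int generic_ops(int op) { (void)op; return 1; }

static const impl_t generic_impl = { "generic", generic_ops, NULL };
static const impl_t simd_impl = { "armv6-simd", simd_ops, &generic_impl };

/* The two arrangements under discussion. */
static const impl_t neon_over_simd = { "neon", neon_ops, &simd_impl };
static const impl_t neon_over_generic = { "neon", neon_ops, &generic_impl };

/* Walk the chain and report which layer ends up handling `op`. */
static const char *resolve(const impl_t *top, int op)
{
    for (const impl_t *i = top; i != NULL; i = i->delegate)
        if (i->can_handle(op))
            return i->name;
    return NULL;
}
```

With neon_over_generic, an operation that only the SIMD layer
accelerates quietly falls through to the generic code, so a broken SIMD
build would be less visible.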

I can prepare a small patch that deals with the problem either way, if
somebody could express a preference.

> Same here. I do wonder how this all will work out for cortex-a9, where 
> ARM decided to make NEON a bit slower to make room for a full VFPv3. 
> Hopefully someone at ARM with access to real A9 silicon could speak
> up?

I don't have access to A9, but I would say that unless NEON is
*drastically* slower than previously, probably no pixman users will
notice.  Also, unless they've done something *completely* stupid with
the instruction timings (like making an add take longer than a multiply)
or the memory access semantics, few or no code changes should be
necessary.  I think that's a rather low risk.  :-)

Blitters tend to be bandwidth limited when written properly, and the
greater issue flexibility (proper out-of-order execution!) promised in
A9 should ease any pain a bit.  I would compare A9 quite directly with
the PowerPC G4.

-- 
------
From: Jonathan Morton
      jonathan.morton at movial.com



