[cairo] Pixman refactoring, ARM and Altivec implementations needed
siarhei.siamashka at nokia.com
Sat May 30 08:14:02 PDT 2009
On Saturday 30 May 2009 01:46:28 ext Koen Kooi wrote:
> On 29-05-09 22:16, Siarhei Siamashka wrote:
> > Would it be a good idea to compile pixman-arm-simd.c with '-march=armv6',
> > ignoring CFLAGS completely?
> Ignoring CFLAGS would be bad, since you might want to force an ABI that
> isn't in the default compiler spec (like armv6+ isn't in yours).
Thanks for joining the discussion. I hope that you can provide some useful
input regarding the subject.
Yes, I also have some worries about potential ABI-related issues; that
was the part of my post which you decided was not worth quoting.
We may have PIC and TLS stuff, EABI/OABI, etc.
> Having said that, if armv6+ isn't in the compiler spec, how did armv6
> simd get enabled in the first place?
The problem is that armv6 simd is *not* getting enabled in this case.
I'll try to make it a bit clearer. In pixman, all the cpu specific
optimizations are isolated into their own source files, and these source
files are compiled with gcc flags that differ from the rest of the
library: the base flags are the same, but extra flags are added,
like "-mmmx -Winline" (for MMX), "-mmmx -msse2 -Winline" (for
SSE2), "-maltivec -mabi=altivec" (for altivec)
and "-mfpu=neon -mfloat-abi=softfp" (for ARM NEON).
Support for ARMv6 optimizations is handled differently in pixman: no extra
flags get added for compiling 'pixman-arm-simd.c' at all.
The responsibility of the configure script is to check whether the compiler
can support these extra cpu extensions. So if you have a very old toolchain
(not supporting NEON, for example), a small test snippet of code will fail to
compile in the configure script and support for these extensions will not be
compiled in. But if your compiler is recent enough, support for NEON can be
compiled as part of pixman even if your main target is some older cpu core.
NEON optimizations will only be used if NEON support is detected at runtime.
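
To illustrate the build-time check plus runtime detection described above,
here is a minimal sketch. It is not pixman's actual code: SKETCH_USE_ARM_NEON,
cpu_has_neon(), add_8_neon() and choose_add_8() are hypothetical names made up
for this example. The idea is only that the optimized routine is compiled in
when configure found a capable compiler, and is selected only when the cpu
reports the extension at runtime.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature for a "saturating add of 8-bit values" routine. */
typedef void (*add_8_fn)(uint8_t *dst, const uint8_t *src, size_t n);

/* Plain C version, always available and built with the normal CFLAGS. */
static void add_8_generic(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        unsigned sum = (unsigned)dst[i] + src[i];
        dst[i] = (uint8_t)(sum > 255 ? 255 : sum);
    }
}

#ifdef SKETCH_USE_ARM_NEON
/* Assumption: configure defines this macro only when the NEON test snippet
 * compiled. Both functions would live in a separate source file built with
 * the extra flags ("-mfpu=neon -mfloat-abi=softfp"). */
void add_8_neon(uint8_t *dst, const uint8_t *src, size_t n);
int cpu_has_neon(void);
#endif

/* Pick an implementation once, based on what the cpu reports at runtime. */
static add_8_fn choose_add_8(void)
{
#ifdef SKETCH_USE_ARM_NEON
    if (cpu_has_neon())
        return add_8_neon;
#endif
    return add_8_generic;
}
```

A caller would do something like "add_8_fn f = choose_add_8();" once and then
call f(dst, src, n). When the macro is not defined, the generic version is
always returned.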
NEON optimizations currently imply that ARMv6 optimizations are supported too
and contain references to ARMv6 code. So a configuration with NEON enabled
but ARMv6 optimizations disabled is a problem, whether you disable ARMv6
artificially or this configuration is selected automatically (having a
toolchain tuned for armv4/armv5te/whatever else in gcc specs).
Things to try may be:
1. try adding something like "-mcpu=arm1136jf-s" to gcc flags (it seems to
be able to override -march in current versions of gcc).
2. add some hack to 'pixman-arm-simd.c' like the line "asm(".arch armv6")" to
the very beginning of it, and use the same hack in the configure code snippet.
3. Implement ARMv6 optimizations fully in assembly and just use gas without
having to deal with gcc inline assembly woes.
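
Option 2 might look roughly like the sketch below. This is an untested idea,
not code from pixman: add_saturate_8x4() is a name invented for the example,
and I have not checked that every gcc version emits the top-level asm before
the code that depends on it. UQADD8 is a real ARMv6 instruction (unsigned
saturating add of four packed bytes); the plain C branch is only there so that
the same file still builds and gives the same results on other targets.

```c
#include <stdint.h>

#if defined(__arm__) && !defined(__thumb__)
/* The hack itself: ask the assembler to accept ARMv6 instructions in this
 * file even if gcc was invoked with an older -march. */
__asm__(".arch armv6");
#endif

/* Hypothetical helper: per-byte saturating add of two packed 32-bit words. */
static uint32_t add_saturate_8x4(uint32_t a, uint32_t b)
{
#if defined(__arm__) && !defined(__thumb__)
    uint32_t r;
    __asm__("uqadd8 %0, %1, %2" : "=r"(r) : "r"(a), "r"(b));
    return r;
#else
    /* Portable fallback with the same semantics as UQADD8. */
    uint32_t r = 0;
    int shift;
    for (shift = 0; shift < 32; shift += 8) {
        uint32_t s = ((a >> shift) & 0xff) + ((b >> shift) & 0xff);
        if (s > 0xff)
            s = 0xff;
        r |= s << shift;
    }
    return r;
#endif
}
```

The configure test snippet would need the same ".arch" line, otherwise the
check and the real build would disagree about what the toolchain accepts.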
Actually 3. may not be such a bad idea. The problem with ARMv6 is that it uses
standard ARM registers for data and can easily run out of them, and the number
of registers available for inline assembly is unpredictable. Some of my ARMv6
optimizations use inline assembly, but are in fact complete assembly
implementations (using the 'naked' attribute) to have full control over
register allocation. This is kind of error prone, because I have seen a
version of gcc which can miscompile 'naked' functions when building code
with the -fno-omit-frame-pointer option (it just screws up 'naked' functions
by inserting function prologue/epilogue instruction sequences).
ARMv6 is the only problematic one, because all the other media extensions have
their own separate registers and inline assembly is just fine for them.
(Constructive) feedback is very much welcome.
That said, I'm more interested in NEON optimizations at the moment.
Additionally, I myself always compile pixman with cortex-a8 optimizations for
the whole library.