[cairo] New ARMv7-A (NEON) optimisations for Pixman
sandmann at daimi.au.dk
Wed May 6 18:12:51 PDT 2009
> We've tried to implement the optimisations in the same sort of way as
> existing Pixman code, to minimise integration problems - the goal having
> always been to contribute these optimisations upstream when they are
> ready. We have a series of patches against 0.15.2, starting with a
> framework for NEON support (based on Ian Rickard's work), then
> successively adding code paths.
> However we do also notice that there is a major refactoring effort going
> on, and so our code might need to be rearranged to match the new layout.
The refactoring will not touch the architecture-specific fast paths,
except for two things:
1. The FastPathInfo tables in pixman-pict.c will move into
their corresponding architecture files.
2. There will be reformatting and renaming of variables and
functions to get rid of the 'fb' prefix.
The first one should not be a real problem; I am fine with postponing
the second one until after the patches land.
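
To illustrate what moving a fast-path table into an architecture file might look like, here is a minimal sketch. All names in it (the sk_ prefix, the enums, the struct layout) are hypothetical and chosen for illustration only; this is not the actual FastPathInfo definition from pixman-pict.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types: NOT pixman's real FastPathInfo layout. */
typedef enum { SK_OP_SRC, SK_OP_OVER } sk_op_t;
typedef enum {
    SK_FMT_NONE, SK_FMT_A8, SK_FMT_X8R8G8B8, SK_FMT_A8R8G8B8
} sk_format_t;

typedef void (*sk_composite_fn)(uint32_t *dst, const uint32_t *src, size_t n);

typedef struct {
    sk_op_t         op;
    sk_format_t     src, mask, dst;
    sk_composite_fn fn;
} sk_fast_path_t;

/* Straight copy between identical 32-bit formats. */
static void sk_copy_8888(uint32_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Format-converting copy: xRGB has no alpha, so force it to opaque. */
static void sk_copy_x888_to_8888(uint32_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] | 0xff000000u;
}

/* The table lives next to the functions it refers to, in the same file. */
static const sk_fast_path_t sk_arch_fast_paths[] = {
    { SK_OP_SRC, SK_FMT_A8R8G8B8, SK_FMT_NONE, SK_FMT_A8R8G8B8, sk_copy_8888 },
    { SK_OP_SRC, SK_FMT_X8R8G8B8, SK_FMT_NONE, SK_FMT_A8R8G8B8, sk_copy_x888_to_8888 },
};

/* Linear search; returns NULL when the general path must be used instead. */
static sk_composite_fn sk_lookup(sk_op_t op, sk_format_t src,
                                 sk_format_t mask, sk_format_t dst)
{
    size_t count = sizeof sk_arch_fast_paths / sizeof sk_arch_fast_paths[0];
    for (size_t i = 0; i < count; i++) {
        const sk_fast_path_t *p = &sk_arch_fast_paths[i];
        if (p->op == op && p->src == src && p->mask == mask && p->dst == dst)
            return p->fn;
    }
    return NULL;
}
```

With the table kept beside the functions it names, adding a new code path touches only the architecture file; a caller would invoke sk_lookup and fall back to the generic code when it returns NULL.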
> (For example, it looks like there's explicit support for NEON code there
> already.) Apparently there is some other NEON code floating around, so
> we might have to do some coordination to avoid too much duplication of
> effort. For the moment we have to consider 0.15.2 as the base version.
The existing NEON code was also based on Ian Rickard's work, but Jeff
would know more. Generally, please use git master as the base for
patches as much as possible.
> Unfortunately we have not had time to include intrinsic versions of the
> blitters, so the optimisations will only work on GCC. The build
> shouldn't break on armcc, as we added a specific autoconf test for
> gcc-inline-asm support (cleaner than #ifdef magic, we think), though we
> don't have a convenient way of testing this directly against armcc. The
> conversion to intrinsics should not be very difficult for an interested
> party to perform.
As long as the build doesn't break on armcc, there is no problem
having gcc-specific code in pixman.
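
As a rough sketch of how gcc-specific code can coexist with other compilers, the example below guards an inline-asm fill behind a configure-time macro and falls back to plain C everywhere else. The macro SKETCH_USE_GCC_INLINE_ASM and the function sk_fill_32 are hypothetical names, standing in for whatever the autoconf test in the patches actually defines; the asm is an illustration, not the code from the patch series.

```c
#include <stddef.h>
#include <stdint.h>

/* SKETCH_USE_GCC_INLINE_ASM is a hypothetical macro that a configure-time
 * test would define. The extra checks keep the asm out of any build that
 * is not GCC targeting 32-bit ARM with NEON enabled. */
#if defined(SKETCH_USE_GCC_INLINE_ASM) && defined(__GNUC__) && \
    defined(__arm__) && defined(__ARM_NEON__)
#define SKETCH_HAVE_NEON_ASM 1
#else
#define SKETCH_HAVE_NEON_ASM 0
#endif

/* Fill n 32-bit pixels with a constant value. */
static void sk_fill_32(uint32_t *dst, uint32_t value, size_t n)
{
#if SKETCH_HAVE_NEON_ASM
    /* Store four pixels (one 128-bit register) per iteration. */
    while (n >= 4) {
        __asm__ volatile(
            "vdup.32  q0, %[v]\n\t"
            "vst1.32  {d0, d1}, [%[d]]!\n\t"
            : [d] "+r"(dst)
            : [v] "r"(value)
            : "q0", "memory");
        n -= 4;
    }
#endif
    /* Portable path: handles the tail, or everything when asm is disabled. */
    while (n > 0) {
        *dst++ = value;
        n--;
    }
}
```

On a compiler without gcc-style inline asm the guarded block simply disappears, so the build still succeeds and only the speed-up is lost.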
> The optimisations cover straight fills, blended fills, straight copies,
> straight blits, format-converting blits (from xRGB8), ARGB8 compositing,
> and glyph (A8 * solid ARGB) rendering. We consider these operations to
> be the most common ones in practical applications.
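
For readers unfamiliar with the glyph case (A8 * solid ARGB), the following is a minimal scalar sketch of the operation, assuming premultiplied alpha and an OVER blend onto a 32-bit destination. The function names are hypothetical; this is a plain C reference for the arithmetic, not the NEON code from the patches.

```c
#include <stddef.h>
#include <stdint.h>

/* Rounded (a * b) / 255 for 8-bit values. */
static inline uint8_t sk_mul_un8(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a * b + 0x80u;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* For each pixel: dst = (solid * mask) OVER dst, with premultiplied ARGB.
 * 'mask' holds one 8-bit coverage value per pixel (the glyph). */
static void sk_glyph_over_8888(uint32_t *dst, const uint8_t *mask,
                               uint32_t solid, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t m   = mask[i];
        /* Alpha of the masked source decides how much of dst survives. */
        uint8_t sa  = sk_mul_un8((uint8_t)(solid >> 24), m);
        uint8_t inv = (uint8_t)(255u - sa);
        uint32_t out = 0;

        for (int shift = 0; shift < 32; shift += 8) {
            uint8_t s = sk_mul_un8((uint8_t)(solid >> shift), m);
            uint8_t d = sk_mul_un8((uint8_t)(dst[i] >> shift), inv);
            uint32_t sum = (uint32_t)s + d;
            if (sum > 255u)          /* guard against non-premultiplied input */
                sum = 255u;
            out |= sum << shift;
        }
        dst[i] = out;
    }
}
```

A mask value of 0 leaves the destination untouched and a value of 255 with an opaque solid colour replaces it outright; everything in between is the per-channel blend that a vectorised version would perform on several pixels at once.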
> We've seen worthwhile performance improvements on the target hardware.
> In some typical cases, such as for glyph rendering, the bottleneck has
> been shifted from the blitter to the X server's overheads. In other
> cases, we are close to saturating the available memory bandwidth. We
> suspect that having the CPU and bus active for a shorter length of time
> should also save power, which is usually important on ARM-based devices.
> The first couple of patches are available essentially immediately, to
> get the ball rolling. The remaining patches in the series depend on our
> customer's approval, which will take time but not much effort. Of
> course knowing exactly where to send the patches would be helpful.
As Jeff said, feel free to send patches to the cairo or xorg-devel
lists. Even better are links to git repositories that can be pulled from.