[cairo] Transform optimization

Thu Nov 6 18:15:56 PST 2008

Hi André,

> I'm searching for some opportunities to optimize other pieces of code,
> and I'm working now with the transformation code. Checking with VTune,
> I saw that a great hotspot at fetching the pixel call (do_fetch). So I
> tried to reduce this fetch, checking if I just read this pixel before,
> and works well for magnifying (about 2x in Core2 and 1.66x in
> Turion).

It is certainly true that do_fetch() is a hotspot, and the main reason
is that it does an indirect call to the format-specific pixel
fetcher. For bilinear filtering, this means four indirect function
calls per pixel.

Working around this problem with caching will help a bit, especially
for upscaling, but as Jeff notes, do_fetch() should just be a memory
read in the common case.
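
To make the "just a memory read" point concrete, here is a rough
sketch. It is not pixman code: source_t, fetch_pixel_fn,
fetch_r5g6b5() and do_fetch_sketch() are names made up for this
example. The idea is only that 32-bit formats take a direct load and
everything else falls back to the indirect call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; not pixman's real structures. */
typedef uint32_t (*fetch_pixel_fn) (const void *bits, int offset);

typedef struct
{
    const void     *bits;
    int             is_8888;  /* nonzero for 32-bit a8r8g8b8 storage */
    fetch_pixel_fn  fetch;    /* format-specific fallback */
} source_t;

/* Example format-specific fetcher: expand r5g6b5 to a8r8g8b8. */
static uint32_t
fetch_r5g6b5 (const void *bits, int offset)
{
    uint32_t p = ((const uint16_t *) bits)[offset];
    uint32_t r = (p >> 11) & 0x1f, g = (p >> 5) & 0x3f, b = p & 0x1f;

    /* Replicate the high bits so that full intensity maps to 0xff. */
    r = (r << 3) | (r >> 2);
    g = (g << 2) | (g >> 4);
    b = (b << 3) | (b >> 2);

    return 0xff000000u | (r << 16) | (g << 8) | b;
}

static inline uint32_t
do_fetch_sketch (const source_t *src, int offset)
{
    assert (src->bits != NULL);

    /* Common case: a plain memory read, no function pointer involved.
     * (An x8r8g8b8 source would additionally OR in 0xff000000.) */
    if (src->is_8888)
        return ((const uint32_t *) src->bits)[offset];

    /* Everything else still pays for the indirect call. */
    return src->fetch (src->bits, offset);
}
```

In a real fetcher the format test would be hoisted out of the
per-pixel loop, so the inner loop contains nothing but loads.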

The better fix here is to write a bilinear fetcher specialized for
the common case where

        - image has format x8r8g8b8 or a8r8g8b8
        - the transformation is affine
        - there is no source clip

Such a fetcher wouldn't have to do any indirect function calls and it
could avoid the check for perspective transformations altogether.
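
A minimal sketch of what such a fetcher might look like, under some
assumptions: coordinates are 16.16 fixed point, pixels outside the
image are clamped to the edge (other repeat modes are ignored), and
image_t, read_pixel(), bilinear_8888() and
fetch_bilinear_affine_8888() are hypothetical names, not existing
pixman functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t fixed_t;        /* 16.16 fixed point (assumed) */

/* Hypothetical image descriptor; not pixman's real structure. */
typedef struct
{
    const uint32_t *bits;       /* a8r8g8b8 pixels */
    int             width, height;
    int             stride;     /* in uint32_t units */
} image_t;

/* Direct memory read with clamp-to-edge; no indirect call. */
static inline uint32_t
read_pixel (const image_t *img, int x, int y)
{
    if (x < 0) x = 0; else if (x >= img->width)  x = img->width - 1;
    if (y < 0) y = 0; else if (y >= img->height) y = img->height - 1;
    return img->bits[(size_t) y * img->stride + x];
}

/* Blend four pixels channel by channel; fx and fy are 0..255. */
static inline uint32_t
bilinear_8888 (uint32_t tl, uint32_t tr, uint32_t bl, uint32_t br,
               unsigned fx, unsigned fy)
{
    uint32_t result = 0;
    int shift;

    for (shift = 0; shift < 32; shift += 8)
    {
        uint32_t top = ((tl >> shift) & 0xff) * (256 - fx) +
                       ((tr >> shift) & 0xff) * fx;
        uint32_t bot = ((bl >> shift) & 0xff) * (256 - fx) +
                       ((br >> shift) & 0xff) * fx;

        result |= ((top * (256 - fy) + bot * fy + 0x8000) >> 16) << shift;
    }
    return result;
}

/* Affine only: (x, y) advance by a constant (ux, uy) per pixel, so
 * there is no homogeneous divide and no perspective check.
 * Note: >> on negative values is assumed to be an arithmetic shift. */
static void
fetch_bilinear_affine_8888 (const image_t *img,
                            fixed_t x, fixed_t y,
                            fixed_t ux, fixed_t uy,
                            uint32_t *out, int n)
{
    int i;

    assert (img->width > 0 && img->height > 0);

    for (i = 0; i < n; i++)
    {
        int x0 = x >> 16, y0 = y >> 16;
        unsigned fx = (x >> 8) & 0xff, fy = (y >> 8) & 0xff;

        out[i] = bilinear_8888 (read_pixel (img, x0,     y0),
                                read_pixel (img, x0 + 1, y0),
                                read_pixel (img, x0,     y0 + 1),
                                read_pixel (img, x0 + 1, y0 + 1),
                                fx, fy);
        x += ux;
        y += uy;
    }
}
```

The four reads per pixel are still there, but each one is an inlined
load rather than a call through a function pointer.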

Another specialization, for pure scaling transformations, could take
advantage of coherence between neighboring output pixels by caching
the interpolation between the two pixels on the right edge of the
previous sample.
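
One way the caching could work, as a sketch only: with a pure scale,
y is constant along a scanline, so the vertical blend of each source
column can be computed once and reused. All names below are
hypothetical, coordinates are assumed to be 16.16 fixed point with a
non-negative x step, and column_evals exists only to make the saving
visible. Blending vertically and then horizontally rounds twice, so
results may differ by one from a single-pass blend.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t fixed_t;            /* 16.16 fixed point (assumed) */

typedef struct                      /* hypothetical, not pixman's */
{
    const uint32_t *bits;           /* a8r8g8b8 pixels */
    int             width, height;
    int             stride;         /* in uint32_t units */
} image_t;

static int column_evals;            /* instrumentation only */

static inline uint32_t
read_pixel (const image_t *img, int x, int y)
{
    if (x < 0) x = 0; else if (x >= img->width)  x = img->width - 1;
    if (y < 0) y = 0; else if (y >= img->height) y = img->height - 1;
    return img->bits[(size_t) y * img->stride + x];
}

/* Blend two a8r8g8b8 pixels channel by channel; f is 0..255. */
static inline uint32_t
lerp_8888 (uint32_t p, uint32_t q, unsigned f)
{
    uint32_t result = 0;
    int shift;

    for (shift = 0; shift < 32; shift += 8)
    {
        uint32_t a = (p >> shift) & 0xff, b = (q >> shift) & 0xff;
        result |= ((a * (256 - f) + b * f + 0x80) >> 8) << shift;
    }
    return result;
}

/* Vertical blend of source column x between rows y0 and y0 + 1. */
static uint32_t
blend_column (const image_t *img, int x, int y0, unsigned fy)
{
    column_evals++;
    return lerp_8888 (read_pixel (img, x, y0),
                      read_pixel (img, x, y0 + 1), fy);
}

static void
fetch_bilinear_scale_8888 (const image_t *img, fixed_t x, fixed_t y,
                           fixed_t ux, uint32_t *out, int n)
{
    int y0 = y >> 16;
    unsigned fy = (y >> 8) & 0xff;
    int have_cache = 0, cached_x = 0, i;
    uint32_t left = 0, right = 0;

    assert (ux >= 0);

    for (i = 0; i < n; i++)
    {
        int x0 = x >> 16;

        if (!have_cache || x0 != cached_x)
        {
            /* Moving one column right: the old right edge is the
             * new left edge, so only one column must be blended. */
            if (have_cache && x0 == cached_x + 1)
                left = right;
            else
                left = blend_column (img, x0, y0, fy);

            right = blend_column (img, x0 + 1, y0, fy);
            cached_x = x0;
            have_cache = 1;
        }

        out[i] = lerp_8888 (left, right, (x >> 8) & 0xff);
        x += ux;
    }
}
```

For a 4x upscale of a two-pixel-wide image into eight output pixels,
this blends three columns instead of sixteen.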

The "fetch-refactor" branch of this repository:

        git://freedesktop.org/~sandmann/pixman

contains a refactoring of some of the transformation code that gets
rid of a lot of gratuitous code duplication. It will likely be useful
as a starting point for optimizing the transformation code.

I will probably merge that branch soon unless someone tells me that it
would interfere with what they are working on.

Soren