[cairo] Pixman scaling performance

Fri Nov 27 07:27:48 PST 2009

On Friday 27 November 2009, Soeren Sandmann wrote:
> Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:
> > It's mostly a question about the status of
> > http://cgit.freedesktop.org/~sandmann/pixman/log/?h=unroll
> > branch, which should improve performance of nearest scaling.
> >
> > If tested with render_bench program on ARM Cortex-A8, we get the
> > following results (it's mostly showing the performance of OVER
> > compositing operation with nearest or bilinear scaling).
> >
> > As expected, bilinear scaling improved a lot in git (almost reaching
> > imlib2), but nearest scaling is still a bit too slow.
> >
> > Both SRC and OVER operations are quite important for browsers.
>
> I was hoping that someone would take your old scaling fast paths and
>
>         - do them with preprocessor macros instead of duplicating them
>
>         - go through the fast path system on top of the 'flags' branch
>           instead of the additional if statements.
>
> But it may make sense to merge the unroll branch regardless.

Well, looks like we had a little bit of miscommunication then.

Generally I see three layers in pixman:

1. Fully functional, simple implementations. They serve as a reference
implementation and also handle the cases which are not supported by fast
paths.
2. Portable fast path C implementations, which cover only a subset of
operations but have lower computational complexity and better performance for
the cases they can handle. They are used on platforms which do not have
special hardware-specific optimizations and should preferably provide
sufficient performance not to lose badly to competing libraries :)
They are also a useful baseline for benchmarking platform-specific
optimizations.
3. Platform-specific optimizations which push performance up to the hardware
limits.
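
To make the split between layers 1 and 2 concrete, here is a minimal sketch
in C of nearest scaling for one scanline. It is not pixman code: every name
in it (fixed_16_16, fetch_pixel_fn, scale_nearest_reference,
scale_nearest_fast_8888, scale_nearest_scanline) is hypothetical, and the
16.16 fixed-point format and the "outside the source is transparent" rule are
assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* 16.16 fixed point; an assumption of this sketch. */
typedef int32_t fixed_16_16;
#define FIXED_ONE ((fixed_16_16) 1 << 16)

/* Generic per-pixel fetcher, so the reference code works for any format. */
typedef uint32_t (*fetch_pixel_fn) (const void *row, int x);

uint32_t fetch_a8r8g8b8 (const void *row, int x)
{
    return ((const uint32_t *) row)[x];
}

/* Floor of a 16.16 position without relying on shifts of negative values. */
static int fixed_to_int_floor (int64_t pos)
{
    return pos >= 0 ? (int) (pos >> 16) : -(int) ((-pos + 0xFFFF) >> 16);
}

/* Layer 1: simple and fully general. Every pixel is bounds checked and
 * fetched through a function pointer. */
void scale_nearest_reference (uint32_t *dst, int dst_width,
                              const void *src_row, int src_width,
                              fetch_pixel_fn fetch,
                              fixed_16_16 vx, fixed_16_16 unit_x)
{
    for (int i = 0; i < dst_width; i++)
    {
        int x = fixed_to_int_floor ((int64_t) vx + (int64_t) i * unit_x);
        dst[i] = (x >= 0 && x < src_width) ? fetch (src_row, x) : 0;
    }
}

/* Layer 2: portable fast path for one format. It only accepts spans that
 * stay inside the source, so the inner loop needs no checks at all.
 * Returns 1 if it handled the span, 0 if the caller must fall back. */
int scale_nearest_fast_8888 (uint32_t *dst, int dst_width,
                             const uint32_t *src, int src_width,
                             fixed_16_16 vx, fixed_16_16 unit_x)
{
    assert (dst_width >= 0);
    if (dst_width == 0)
        return 1;
    if (vx < 0 || unit_x < 0)
        return 0;

    int64_t last = (int64_t) vx + (int64_t) (dst_width - 1) * unit_x;
    if (last > INT32_MAX || (last >> 16) >= src_width)
        return 0;

    uint32_t pos = (uint32_t) vx;
    for (int i = 0; i < dst_width; i++)
    {
        dst[i] = src[pos >> 16];
        pos += (uint32_t) unit_x;
    }
    return 1;
}

/* Dispatch: try the fast path first, fall back to the reference code. */
void scale_nearest_scanline (uint32_t *dst, int dst_width,
                             const uint32_t *src, int src_width,
                             fixed_16_16 vx, fixed_16_16 unit_x)
{
    if (!scale_nearest_fast_8888 (dst, dst_width, src, src_width, vx, unit_x))
        scale_nearest_reference (dst, dst_width, src, src_width,
                                 fetch_a8r8g8b8, vx, unit_x);
}
```

A layer 3 implementation would slot in the same way: one more candidate that
is tried before the portable fast path, with the reference code remaining the
fallback for everything else.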

Automated testing can help to compare these alternative implementations to
each other. Tests should be sensitive enough to easily detect problems in
corner cases (like bugs related to '<=' vs. '<' operator usage in an 'if'
condition, etc.). The tests themselves can be tested by deliberately
introducing bugs into the code and checking whether they get caught.
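
As a rough illustration of this kind of testing, the sketch below compares a
candidate implementation against a reference over a sweep of small inputs,
and then checks the test itself with a deliberately broken variant that uses
'<=' where '<' is needed. It is not the pixman test suite: all function names
and the parameter ranges are made up for the example, and the source buffer
carries one extra sentinel pixel so that the off-by-one read of the broken
variant stays inside the array.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_SRC 8
#define MAX_DST 16

typedef void (*scale_fn) (uint32_t *dst, int dst_width,
                          const uint32_t *src, int src_width,
                          uint32_t vx, uint32_t unit_x);

/* Reference: recompute the 16.16 source position for every pixel. */
void scale_reference (uint32_t *dst, int dst_width,
                      const uint32_t *src, int src_width,
                      uint32_t vx, uint32_t unit_x)
{
    for (int i = 0; i < dst_width; i++)
    {
        uint32_t x = (uint32_t) (((uint64_t) vx + (uint64_t) i * unit_x) >> 16);
        dst[i] = (x < (uint32_t) src_width) ? src[x] : 0;
    }
}

/* Candidate: steps the position incrementally instead of multiplying. */
void scale_fast (uint32_t *dst, int dst_width,
                 const uint32_t *src, int src_width,
                 uint32_t vx, uint32_t unit_x)
{
    uint32_t pos = vx;
    for (int i = 0; i < dst_width; i++, pos += unit_x)
        dst[i] = ((pos >> 16) < (uint32_t) src_width) ? src[pos >> 16] : 0;
}

/* Same as scale_fast, with an artificially introduced '<=' bug. */
void scale_buggy (uint32_t *dst, int dst_width,
                  const uint32_t *src, int src_width,
                  uint32_t vx, uint32_t unit_x)
{
    uint32_t pos = vx;
    for (int i = 0; i < dst_width; i++, pos += unit_x)
        dst[i] = ((pos >> 16) <= (uint32_t) src_width) ? src[pos >> 16] : 0;
}

/* Runs the candidate against the reference over a sweep of widths, scale
 * steps and start offsets, and returns how many cases differ. */
int count_mismatches (scale_fn candidate)
{
    static const uint32_t steps[] = { 0x8000, 0x10000, 0x18000, 0x20000 };
    uint32_t src[MAX_SRC + 1], expected[MAX_DST], actual[MAX_DST];
    int mismatches = 0;

    assert (candidate != 0);
    for (int src_width = 1; src_width <= MAX_SRC; src_width++)
    {
        for (int i = 0; i < src_width; i++)
            src[i] = 0xff000000u | (uint32_t) (i + 1);
        src[src_width] = 0xdeadbeefu;   /* sentinel just past the image */

        for (int dst_width = 0; dst_width <= MAX_DST; dst_width++)
            for (size_t s = 0; s < sizeof (steps) / sizeof (steps[0]); s++)
                for (uint32_t vx = 0; vx <= 0x10000; vx += 0x4000)
                {
                    scale_reference (expected, dst_width, src, src_width,
                                     vx, steps[s]);
                    candidate (actual, dst_width, src, src_width,
                               vx, steps[s]);
                    if (memcmp (expected, actual,
                                (size_t) dst_width * sizeof (uint32_t)) != 0)
                        mismatches++;
                }
    }
    return mismatches;
}
```

With this sweep, count_mismatches (scale_fast) should come out as zero, while
count_mismatches (scale_buggy) should be non-zero. If the broken variant
passed as well, that would mean the test never reaches the edge of the source
image and is not sensitive enough.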

Both your patch from the 'unroll' branch and my older nearest neighbor
scaling patch belong to the second layer. They are more or less comparable to
each other. Your patch is more generic and supports more cases of clipping.
My patch was somewhat faster on ARM (with almost no performance difference on
x86).

In any case, I'm most interested in the third layer and ARM-specific
optimizations (for scaling too). That's why I would like to see the 'unroll'
branch merged so that we can move on. It's nice to see pixman performance
improving steadily.

-- 
Best regards,
Siarhei Siamashka