[cairo] [PATCH] pixman: C fast path for add_1000_1000 and over_n_1_8888

Siarhei Siamashka siarhei.siamashka at gmail.com
Sun Nov 8 17:41:24 PST 2009

On Sunday 08 November 2009, Chris Wilson wrote:
> Hi Siarhei,
> 	I was just worrying about the absence of such paths from the
> current set of cairo-traces. The only attempt I've made at capturing a
> wide range of fonts and languages are the gnome-terminal and original
> firefox traces. I suspect that these and my fontsets do not accurately
> reflect your usage at all (and so my profiling is woefully myopic).
> xfce4-terminal with a recent vte will use cairo for its rendering so
> should generate a good trace, as will firefox and other gtk+
> applications. Could you record some sample cairo-traces so that we can
> see how much impact the addition of pixman fast paths makes to your
> workflow, and so that we do not neglect you when developing the other
> backends as well?

The use of these functions actually depends on the font. I noticed that
performance was quite bad when using bitmap fonts such as terminus:

It's probably not a very important case for most users, though I myself
prefer to use bitmap fonts in terminals. Still, performance is exceptionally
bad here unless pixman has the needed fast path functions.

Here is the trace (scrolling 'man gcc' in xfce4-terminal with terminus font,
16bpp desktop, ARM cpu):

Actually, after upgrading cairo and some of the other libraries, I now get
somewhat different behavior from what I saw before. This is an oprofile log
for the Xorg process with current pixman git:

samples  %        image name               symbol name
13296    29.1528  libpixman-1.so.0.17.1    combine_over_u
6452     14.1466  libpixman-1.so.0.17.1    fetch_scanline_r5g6b5
5516     12.0944  libpixman-1.so.0.17.1    fetch_scanline_a1
2273      4.9838  libpixman-1.so.0.17.1    store_scanline_r5g6b5
1741      3.8173  libpixman-1.so.0.17.1    fast_composite_add_1000_1000
1718      3.7669  libc-2.9.so              memcpy
1176      2.5785  libpixman-1.so.0.17.1    arm_neon_fill
1114      2.4426  vmlinux                  __memzero
951       2.0852  libpixman-1.so.0.17.1    bits_image_fetch_solid_32
640       1.4033  libpixman-1.so.0.17.1    _pixman_run_fast_path
513       1.1248  libc-2.9.so              _int_malloc
447       0.9801  libpixman-1.so.0.17.1    
377       0.8266  libc-2.9.so              malloc
350       0.7674  libfb.so                 image_from_pict
321       0.7038  libc-2.9.so              _int_free
307       0.6731  vmlinux                  __do_softirq
293       0.6424  Xorg                     miGlyphs
270       0.5920  Xorg                     CompositePicture
210       0.4604  libc-2.9.so              free
204       0.4473  libfb.so                 fbComposite

This clearly shows that 'over_n_1_0565' is now also badly needed for this use
case. Earlier, only 'over_n_1_8888' was called and the result was then
converted to 0565 as an additional step (which was bad in itself, but was
a separate problem that seems to be solved now).
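
To make it clearer what such a fast path has to do, here is a rough sketch in
C of an 'over_n_1_0565' style loop: a solid premultiplied a8r8g8b8 source, a
1-bit mask and an r5g6b5 destination. This is not the pixman code. The
function names, the argument list and the LSB-first bit order within each mask
byte are all assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: expand r5g6b5 to opaque a8r8g8b8 by bit replication. */
static uint32_t convert_0565_to_8888(uint16_t p)
{
    uint32_t r = (p >> 11) & 0x1f;
    uint32_t g = (p >> 5) & 0x3f;
    uint32_t b = p & 0x1f;

    r = (r << 3) | (r >> 2);
    g = (g << 2) | (g >> 4);
    b = (b << 3) | (b >> 2);
    return 0xff000000u | (r << 16) | (g << 8) | b;
}

/* Hypothetical helper: truncate a8r8g8b8 to r5g6b5 (alpha is dropped). */
static uint16_t convert_8888_to_0565(uint32_t p)
{
    return (uint16_t)(((p >> 8) & 0xf800) |
                      ((p >> 5) & 0x07e0) |
                      ((p >> 3) & 0x001f));
}

/* Rounded c * a / 255 for 8-bit values. */
static uint32_t mul_un8(uint32_t c, uint32_t a)
{
    uint32_t t = c * a + 0x80;
    return (t + (t >> 8)) >> 8;
}

/* OVER for premultiplied pixels: src + dst * (255 - src_alpha) / 255. */
static uint32_t over(uint32_t src, uint32_t dst)
{
    uint32_t ia = 255 - (src >> 24);
    uint32_t result = 0;
    int shift;

    for (shift = 0; shift < 32; shift += 8) {
        uint32_t s = (src >> shift) & 0xff;
        uint32_t d = (dst >> shift) & 0xff;
        uint32_t v = s + mul_un8(d, ia);

        if (v > 255)
            v = 255;
        result |= v << shift;
    }
    return result;
}

/*
 * Sketch only. mask_stride is in bytes, dst_stride is in pixels.
 * Assumption: bit (x & 7) of mask byte (x >> 3) selects pixel x.
 */
static void sketch_over_n_1_0565(uint32_t src,
                                 const uint8_t *mask, size_t mask_stride,
                                 uint16_t *dst, size_t dst_stride,
                                 int width, int height)
{
    int opaque = (src >> 24) == 0xff;
    uint16_t src565 = convert_8888_to_0565(src);
    int x, y;

    for (y = 0; y < height; y++) {
        const uint8_t *m = mask + (size_t)y * mask_stride;
        uint16_t *d = dst + (size_t)y * dst_stride;

        for (x = 0; x < width; x++) {
            if (!(m[x >> 3] & (1u << (x & 7))))
                continue;               /* mask bit clear: leave pixel alone */
            if (opaque)
                d[x] = src565;          /* opaque source: plain store */
            else
                d[x] = convert_8888_to_0565(
                    over(src, convert_0565_to_8888(d[x])));
        }
    }
}
```

The point of doing it in a single loop is that pixels with a clear mask bit
are skipped entirely and an opaque source needs no destination read at all,
instead of fetching whole scanlines into temporary a8r8g8b8 buffers, combining
them and storing them back.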

Still, the 'over_n_1_8888' fast path is also useful for 32bpp desktops, such
as the PS3, which I'm using for testing big endian compatibility.

I'll post some more benchmarks for this 1-bit stuff later.

Best regards,
Siarhei Siamashka