[cairo] Pixman glyph performance, and beyond!
chris at chris-wilson.co.uk
Thu Oct 22 14:13:40 PDT 2009
So I'm reviewing how cairo handles compositing, looking at how we may
drive cairo-gl more efficiently. As part of that process, I've had the
opportunity to remove some overhead from within cairo-image. However,
glyph composition still suffers from substantial overhead since every
glyph is composited separately.
firefox-talos-gfx on a slow Celeron 600MHz:
# Overhead Symbol
# ........ ......
23.76% [.] _pixman_run_fast_path
23.34% [.] sse2_composite_add_n_8888_8888_ca
11.82% [.] sse2_composite_over_n_8888_8888_ca
6.31% [.] pixman_image_composite
4.69% [.] walk_region_internal
4.44% [.] pixman_blt_sse2
3.18% [.] _pixman_image_validate
2.30% [.] sse2_composite_over_n_8_8888
2.23% [.] pixman_compute_composite_region32
2.19% [.] pixman_fill_sse2
1.91% [.] sse2_composite
(And to put it in perspective:
Søren has looked at this problem in the past and begun work on
fast-path and faster-fast-path branches, looking to cache prior
fast-path resolutions. These are not yet as effective as one would hope.
How insane would it be to push the get_fast_path() lookup out to the user,
who could then pass in the implementation + composite function instead of
pixman performing the search every time? This would also be useful for spans.
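
To make the idea concrete, here is a rough sketch of resolving the fast path once per glyph run and then reusing the function pointer for every glyph. Every name in it (op_t, fmt_t, fast_path_t, get_fast_path_sketch, add_8_8, glyph_t, composite_glyphs, lookup_count) is a hypothetical stand-in made up for illustration; none of it is pixman's actual API, and the one-dimensional a8 ADD routine is a toy in place of the real SSE2 paths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not pixman API. */
typedef enum { OP_OVER, OP_ADD } op_t;
typedef enum { FMT_A8, FMT_A8R8G8B8 } fmt_t;

typedef void (*composite_func_t)(uint8_t *dst, const uint8_t *src, int n);

typedef struct {
    op_t op;
    fmt_t src, dst;
    composite_func_t func;
} fast_path_t;

/* Counts table searches so the saving is observable. */
static int lookup_count;

/* Toy fast path: saturating ADD of an a8 source onto an a8 destination. */
static void add_8_8(uint8_t *dst, const uint8_t *src, int n)
{
    for (int i = 0; i < n; i++) {
        unsigned sum = (unsigned)dst[i] + src[i];
        dst[i] = sum > 255 ? 255 : (uint8_t)sum;
    }
}

static const fast_path_t fast_paths[] = {
    { OP_ADD, FMT_A8, FMT_A8, add_8_8 },
};

/* The linear search that would otherwise run once per composite call. */
static composite_func_t get_fast_path_sketch(op_t op, fmt_t src, fmt_t dst)
{
    lookup_count++;
    for (size_t i = 0; i < sizeof fast_paths / sizeof fast_paths[0]; i++) {
        const fast_path_t *p = &fast_paths[i];
        if (p->op == op && p->src == src && p->dst == dst)
            return p->func;
    }
    return NULL;
}

/* A glyph reduced to a single row of a8 coverage at some offset. */
typedef struct {
    const uint8_t *pixels;
    int offset;
    int len;
} glyph_t;

/* Caller side: resolve once, then composite every glyph with the result. */
static int composite_glyphs(uint8_t *mask, const glyph_t *glyphs, int n)
{
    composite_func_t func = get_fast_path_sketch(OP_ADD, FMT_A8, FMT_A8);
    if (func == NULL)
        return -1; /* no fast path; a real caller would fall back */
    for (int i = 0; i < n; i++)
        func(mask + glyphs[i].offset, glyphs[i].pixels, glyphs[i].len);
    return 0;
}
```

Calling composite_glyphs() on a run of N glyphs performs one table search rather than N. Whether a real interface would hand back a bare function pointer or an opaque handle carrying the implementation as well is left open here.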
And considering how most cairo operations are first performed to a mask,
cairo could very effectively cache the fast path for its most frequent
I'm particularly interested in suggestions and experiences from the
ARM/NEON guys as they seem to be suffering acutely from similar overheads
in pixman - and so I presume are also looking at this issue.
Chris Wilson, Intel Open Source Technology Centre