[cairo] Performance of the refactored Pixman
Jonathan Morton
jonathan.morton at movial.com
Fri Jun 12 05:46:41 PDT 2009
I've had a unique opportunity to compare the performance of the
refactored Pixman with an older version, using an identical set of
blitters and other overhead improvements (some forward-ported, some
back-ported).
Simply put, the refactored Pixman is consistently slower.
The main reason for this, I believe, is the considerably larger
parameter block being passed up the call chain. An extra parameter has
been added to this standardised block, and several of the others have
been doubled in size. Because these parameters are passed on the stack,
they have to be copied again at every call in the chain.
The hurt is particularly bad on small requests. Browsers can do a lot
of one-pixel trapezoids and glyph strings, the latter requiring a pixman
call for each individual glyph as well as for the whole string. The
extra overhead can therefore cost up to 40% of the performance,
compared to an un-refactored version with the same mallocectomies and
blitters.
Most of this is probably offset by efficiency improvements (such as
mallocectomies) actually introduced between the two versions, and I
believe the newer version will be easier to maintain, but it really
would be nice to avoid a running-to-stand-still syndrome.
My big suggestion is to collapse these huge parameter blocks into a
structure, which can then be passed by-reference up the chain. This
would reduce the call overhead to two parameters, which will fit in
registers and therefore do not necessarily have to be copied.
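To make the suggestion concrete, here is a rough sketch of the two calling
styles side by side. Everything named sketch_* is made up for illustration
and is not taken from the Pixman tree; the field list only approximates a
composite request (operator, three images, three coordinate pairs, a size).
The leaf functions just fold the request into a number so the two chains
can be checked against each other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not Pixman's real types. */
typedef struct sketch_image { int32_t width, height; } sketch_image_t;
typedef struct sketch_impl  { int id; } sketch_impl_t;

/* All per-request parameters gathered into one block. */
typedef struct {
    int             op;
    sketch_image_t *src, *mask, *dest;
    int32_t         src_x, src_y;
    int32_t         mask_x, mask_y;
    int32_t         dest_x, dest_y;
    int32_t         width, height;
} sketch_composite_info_t;

/* Wide style: stands in for a blitter, folding the request into one number. */
int64_t
sketch_leaf_wide (sketch_impl_t *imp, int op,
                  sketch_image_t *src, sketch_image_t *mask, sketch_image_t *dest,
                  int32_t src_x, int32_t src_y, int32_t mask_x, int32_t mask_y,
                  int32_t dest_x, int32_t dest_y, int32_t width, int32_t height)
{
    assert (imp != NULL && src != NULL && dest != NULL);
    (void) mask; /* the mask is optional and may be NULL */
    return (int64_t) width * height + op
        + src_x + src_y + mask_x + mask_y + dest_x + dest_y;
}

/* Each intermediate level has to pass all thirteen arguments along again. */
int64_t
sketch_delegate_wide (sketch_impl_t *imp, int op,
                      sketch_image_t *src, sketch_image_t *mask, sketch_image_t *dest,
                      int32_t src_x, int32_t src_y, int32_t mask_x, int32_t mask_y,
                      int32_t dest_x, int32_t dest_y, int32_t width, int32_t height)
{
    return sketch_leaf_wide (imp, op, src, mask, dest, src_x, src_y,
                             mask_x, mask_y, dest_x, dest_y, width, height);
}

/* Struct style: the same computation, reading fields through one pointer. */
int64_t
sketch_leaf_info (sketch_impl_t *imp, const sketch_composite_info_t *info)
{
    assert (imp != NULL && info->src != NULL && info->dest != NULL);
    return (int64_t) info->width * info->height + info->op
        + info->src_x + info->src_y + info->mask_x + info->mask_y
        + info->dest_x + info->dest_y;
}

/* Each intermediate level now forwards just two pointers. */
int64_t
sketch_delegate_info (sketch_impl_t *imp, const sketch_composite_info_t *info)
{
    return sketch_leaf_info (imp, info);
}

/* Pushes one 1x1 request through both chains; returns 1 if they agree. */
int
sketch_styles_agree (void)
{
    sketch_impl_t  imp = { 0 };
    sketch_image_t src = { 16, 16 }, dest = { 16, 16 };
    sketch_composite_info_t info = { 3, &src, NULL, &dest, 1, 2, 0, 0, 5, 6, 1, 1 };

    int64_t wide = sketch_delegate_wide (&imp, info.op, info.src, info.mask,
                                         info.dest, info.src_x, info.src_y,
                                         info.mask_x, info.mask_y, info.dest_x,
                                         info.dest_y, info.width, info.height);
    int64_t packed = sketch_delegate_info (&imp, &info);
    return wide == packed;
}
```

The block is filled in once at the entry point, and every level below only
forwards two pointers. Whether those actually stay in registers depends on
the target's calling convention, so treat this as a shape to measure rather
than a guaranteed win.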
Along related but distinct lines, I'm greatly in favour of a dedicated
"overlappable, unscaled copy" function in Pixman for scrolling support.
The call chain overhead is utterly killing performance for XCopyArea at
the moment. Failing that, dedicated single-scanline get/put functions
would probably be an improvement, internally as well as externally.
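For what I mean by an overlappable, unscaled copy, a minimal sketch follows.
The name sketch_copy_area and its signature are hypothetical, not an
existing Pixman entry point. It assumes a single 32bpp buffer, a positive
stride counted in pixels, and rectangles the caller has already clipped.
Rows are walked bottom-up when the destination lies below the source, so no
source row is overwritten before it is read, and memmove takes care of
horizontal overlap within a row.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy a width x height rectangle inside one buffer,
 * allowing source and destination to overlap. Assumes 32bpp pixels, a
 * positive stride in pixels, and pre-clipped coordinates. */
void
sketch_copy_area (uint32_t *bits, int stride,
                  int src_x, int src_y, int dst_x, int dst_y,
                  int width, int height)
{
    size_t row_bytes;
    int    y;

    assert (bits != NULL && stride > 0);
    if (width <= 0 || height <= 0)
        return;

    row_bytes = (size_t) width * sizeof (uint32_t);

    if (dst_y > src_y)
    {
        /* Moving down: start from the last row so that source rows are
         * read before the copy reaches and overwrites them. */
        for (y = height - 1; y >= 0; y--)
            memmove (bits + (size_t) (dst_y + y) * stride + dst_x,
                     bits + (size_t) (src_y + y) * stride + src_x,
                     row_bytes);
    }
    else
    {
        /* Moving up or sideways: top-down order is safe, and memmove
         * handles overlap within a single row. */
        for (y = 0; y < height; y++)
            memmove (bits + (size_t) (dst_y + y) * stride + dst_x,
                     bits + (size_t) (src_y + y) * stride + src_x,
                     row_bytes);
    }
}
```

Scrolling a window down by one line would then be a single call with
dst_y = src_y + 1, with no per-scanline trip through the general
compositing path.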
--
------
From: Jonathan Morton
jonathan.morton at movial.com