[cairo] Pixman - refactoring of fbFetchTransformed - performance issue
Bertram Felgenhauer
bertram.felgenhauer at googlemail.com
Tue Feb 19 13:54:09 PST 2008
Antoine Azar wrote:
> I'm very surprised at this 30% number. The refactoring shouldn't come
> anywhere close to that number.
I agree, and I was surprised as well. However, the result is
reproducible and the numbers are stable.
It appears that the compiler (gcc) does a worse job optimizing the
refactored code. One difference is that it needs to set up a larger
stack frame for fetchFromRegion() than it had to for fetch().
On the other hand, the additional indirect call is as cheap as possible;
the whole code of fbFetchFromNoRegion is
    00005320 <fbFetchFromNoRegion>:
        5320:   8b 4c 24 14     mov    0x14(%esp),%ecx
        5324:   ff e1           jmp    *%ecx
and the jump should be predicted nicely.
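
To illustrate why such a wrapper can be this cheap, here is a minimal sketch of a pass-through function in C. The names fetch_pixel_fn, fetch_direct and fetch_from_no_region, and their signatures, are hypothetical and are not pixman's actual code. Because the wrapper only forwards its leading arguments to the callback and returns the result, an optimizing compiler can typically turn the call into a tail call: load the pointer, then jump.

```c
#include <stdint.h>

/* Hypothetical callback type; pixman's real fetch signature differs. */
typedef uint32_t (*fetch_pixel_fn)(const uint32_t *bits, int offset);

/* Hypothetical fetcher: reads one pixel straight from the buffer. */
uint32_t fetch_direct(const uint32_t *bits, int offset)
{
    return bits[offset];
}

/*
 * Hypothetical pass-through wrapper. The callee receives the same
 * leading arguments in the same order, so on 32-bit x86 they already
 * sit in the right stack slots and no new frame is needed. The call
 * is in tail position, which lets the compiler emit a jump instead
 * of a call/ret pair.
 */
uint32_t fetch_from_no_region(const uint32_t *bits, int offset,
                              fetch_pixel_fn fetch)
{
    return fetch(bits, offset);
}
```

Compiling something like this with gcc -m32 -O2 -S and reading the assembly for the wrapper should show a similar load-and-jump sequence, although the exact output depends on the compiler version and options.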
I haven't looked at the differences in detail.
> I applied Bertram's patch for the perf test suite and ran just those
> mag/min tests before and after the refactoring, and I'm seeing 1-2%
> performance differences. You can see the results of both runs and the
> speedup/slowdown factor here: http://www.antoineazar.com/refactor.htm
What processor and which compiler (and options) are you using?
As an aside, it's odd that the difference between *_source and *_over
is so large in your tests, and that *_source turns out to be slower.
I wonder why.
My numbers are different, for example:
(before)
[  9] image-rgba  paint_image_rgba_mag_source-256  12614763   6.919   7.006  0.81%  8
[ 10] image-rgba  paint_image_rgba_min_over-256    14194059   7.785   7.788  0.09%  4
(after)
[  9] image-rgba  paint_image_rgba_mag_source-256  16889333   9.264   9.267  0.02%  5
[ 10] image-rgba  paint_image_rgba_min_over-256    18351320  10.066  10.086  0.51%  5
with hardly any difference between _source and _over.
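
For reference, the slowdown implied by these figures can be worked out from the first time column of each row (6.919 vs. 9.264 and 7.785 vs. 10.066). The helper below is only a sketch of that arithmetic; slowdown_percent is a hypothetical name and is not part of the perf suite.

```c
/*
 * Percentage by which `after` is slower than `before`.
 * Both values must be in the same unit (here: the first time
 * column of the perf output above).
 */
double slowdown_percent(double before, double after)
{
    return (after / before - 1.0) * 100.0;
}
```

Applied to the rows above, slowdown_percent(6.919, 9.264) gives about 34% and slowdown_percent(7.785, 10.066) gives about 29%, which is where the figure of roughly 30% comes from.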
> Could a third person run the perf before and after the refactoring to
> check the numbers?
Yes, that would help - the more numbers the better.
The relevant commits are in the pixman repository,
e95638c629334151e27633cc1c476ea582d766ec
(before the refactoring)
and
8d79c48126398aa7b31e9bb9e25af9d231075604
(after)
Of particular interest are the *_image_rgba_min_* and
*_image_rgba_mag_* tests in the performance test suite (now in
cairo's git repository).
Bertram