[cairo] [PATCH] SSE2 support for pixman (v2)
andrelrt at gmail.com
Mon Mar 17 07:57:52 PDT 2008
> Did you see why there are some big performance regressions between
> perf-mmx-base-run4 and perf-sse2-run4?
> With cairo-perf-diff there are a few cases that are quite serious:
Do you want to see something quite curious? Try to compare
perf-mmx-base-run1 and perf-mmx-base-run3 :)
I ran these 4 perf runs one after another, but (I don't know why) there
are always some differences. I got differences of about 1.9x, speedup or
slowdown, with the same code. Since I have not finished all the SSE2 code
yet, maybe this test ran the MMX code.
BTW: I'm now trying to finish all the code first; after that I'll run a
profiler (VTune) to look for bottlenecks in the code.
> I have a few observations about your patch:
> Introducing whitespace noise is not very desirable.
Sorry about that. I'm trying not to modify whitespace, but
sometimes... I think that git-diff has an "ignore whitespace" flag;
I'll check this next time.
> Overall, I found that sse is not that much of a help for a Core 2 cpu, that
> can sustain the same memory bandwidth with mmx code. The same cannot be said
> for other models such as the P4, which gets a pretty good speedup.
That sounds strange. The performance on a Core2 machine should
increase too. The MMX code loads a pixel, does the transformation and
saves a pixel. The SSE2 code loads 4 pixels, does 4 transformations at
the same time and saves 4 pixels.
I'll reinstall Linux on my Core2 machine and run these tests too.
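To illustrate the "one pixel per step" versus "4 pixels per step" difference, here is a minimal sketch. It is not the code from the patch: the function names add_pixel_scalar and add_pixels_sse2 are hypothetical, and a per-channel saturating add is assumed as the transformation only because it is the simplest case to show. The scalar function stands in for the one-pixel-at-a-time path; it is plain C, not MMX.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Scalar reference (hypothetical helper): saturating add of each
 * 8-bit channel, handling one 32-bit pixel per call. */
static uint32_t add_pixel_scalar(uint32_t d, uint32_t s)
{
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t c = ((d >> shift) & 0xff) + ((s >> shift) & 0xff);
        if (c > 0xff)
            c = 0xff;               /* clamp the channel to 255 */
        out |= c << shift;
    }
    return out;
}

/* SSE2 sketch (hypothetical helper): load 4 pixels, transform all
 * 16 channels with one instruction, store 4 pixels. */
static void add_pixels_sse2(uint32_t *dst, const uint32_t *src, size_t n)
{
    size_t i = 0;

    assert(dst != NULL && src != NULL);

    for (; i + 4 <= n; i += 4) {
        /* Unaligned loads, so the buffers need no special alignment. */
        __m128i d = _mm_loadu_si128((const __m128i *)(dst + i));
        __m128i s = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_adds_epu8(d, s));
    }

    /* Leftover pixels (n not a multiple of 4) take the scalar path. */
    for (; i < n; i++)
        dst[i] = add_pixel_scalar(dst[i], src[i]);
}
```

Calling add_pixels_sse2(dst, src, n) should give the same result as applying add_pixel_scalar to every pixel. Whether it is actually faster on a given CPU depends on memory bandwidth as much as on the instruction count, which is the point under discussion here.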