[cairo] Transform optimization
Jeff Muizelaar
jeff at infidigm.net
Wed Nov 5 16:10:50 PST 2008
On Wed, Nov 05, 2008 at 09:27:17PM -0200, André Tupinambá wrote:
> Hi everyone,
>
> I'm searching for some opportunities to optimize other pieces of code,
> and I'm working now with the transformation code.
Thanks for looking into this; the transformation code could definitely
use some work.
> Checking with VTune, I saw that a great hotspot at fetching the pixel
> call (do_fetch). So I tried to reduce this fetch, checking if I just
> read this pixel before, and works well for magnifying (about 2x in
> Core2 and 1.66x in Turion).
What about when minifying? It seems like this patch would cause a slowdown
there, because consecutive fetches always go to different pixels and the
extra test never pays off.
It would also be good to know why fetches are so slow. In theory it
should only be a cached memory access which should be pretty quick.
However, I can see the advantage of avoiding fetches when we are
upscaling by a large amount, since the samples don't change for an entire
region of destination pixels. But I don't really like the idea of adding
more code to the inside of an inner loop, certainly not without some more
performance numbers for scaling to different sizes. It would also be
good to have some idea about what the cost of a fetch is, so that we know
how important they are to avoid.
Further, I wonder if a more implicit approach would work better. If we
could do a better job knowing when we need to read a new sample
we wouldn't need to test for it every time. Something like:
    while (dest < dest_end) {
        compute_src_pixel_location();
        fetch_src_pixels();
        while (dest < dest_end && src_pixels_the_same) {
            *dest = compute_dest_pixel(src_pixels);
            dest++;
        }
    }
instead of your patch, which does something more like:
    while (dest < dest_end) {
        compute_src_pixel_location();
        if (!src_pixels_the_same) {
            fetch_src_pixels();
        }
        *dest = compute_dest_pixel(src_pixels);
        dest++;
    }
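To make the difference between the two loop shapes concrete, here is a
rough sketch in C for the simplest case I can think of: nearest-neighbour
scaling of a single scanline with a 16.16 fixed-point step. Everything in
it is an assumption made for illustration (fetch_pixel, fetch_count,
scale_checked and scale_runs are made-up names, not pixman code), and the
real fetchers are more involved. The first function tests on every
destination pixel; the second computes how many destination pixels map to
the current source pixel and fills that run without testing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an expensive fetch; counts how often it runs. */
static unsigned fetch_count;

static uint32_t fetch_pixel(const uint32_t *src, size_t x)
{
    fetch_count++;
    return src[x];
}

/* Explicit test: compare the source index for every destination pixel.
 * step is the source advance per destination pixel in 16.16 fixed point. */
static void scale_checked(uint32_t *dest, size_t dest_len,
                          const uint32_t *src, uint32_t step)
{
    uint64_t pos = 0;
    uint64_t last_x = UINT64_MAX;   /* no sample fetched yet */
    uint32_t pixel = 0;

    for (size_t i = 0; i < dest_len; i++) {
        uint64_t x = pos >> 16;
        if (x != last_x) {
            pixel = fetch_pixel(src, (size_t)x);
            last_x = x;
        }
        dest[i] = pixel;
        pos += step;
    }
}

/* Implicit: fetch once, work out how long the sample stays valid,
 * then fill that whole run with no per-pixel comparison. */
static void scale_runs(uint32_t *dest, size_t dest_len,
                       const uint32_t *src, uint32_t step)
{
    assert(step > 0);               /* a zero step would never advance */

    uint64_t pos = 0;
    size_t i = 0;

    while (i < dest_len) {
        uint64_t x = pos >> 16;
        uint32_t pixel = fetch_pixel(src, (size_t)x);

        /* Destination pixels left before pos crosses into source pixel
         * x + 1: ceil(remaining / step). */
        uint64_t remaining = ((x + 1) << 16) - pos;
        uint64_t run = (remaining + step - 1) / step;
        if (run > dest_len - i)
            run = dest_len - i;

        for (uint64_t k = 0; k < run; k++)
            dest[i++] = pixel;
        pos += run * step;
    }
}
```

Both do the same number of fetches. The run version trades the per-pixel
compare for one division per fetched sample, so I would only expect it to
help when magnifying; when minifying every run has length one and the
division is pure overhead. That is the kind of thing the performance
numbers would need to show.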
-Jeff