[cairo] Transform optimization

Wed Nov 5 18:53:25 PST 2008

Good find, André. I had noticed the exact same thing a while back.

I sent a patch similar to yours. I've been wanting to clean it up and
resend it, but I'm bogged down in Win32 build problems.

However, I spent a good amount of time benchmarking my patch under various
conditions. Please have a look at the mail I sent on April 23rd:
http://lists.cairographics.org/archives/cairo/2008-April/013905.html

I'm attaching my original patch and the benchmark numbers I had. Jeff,
Soeren had already raised the concern that minifying would be slowed
down, but I didn't observe that in my tests. The worst case was a 14%
performance decrease, compared to performance increases of up to 3X. Most of
the minification tests showed no difference. Furthermore, if you have
bilinear filtering enabled, you gain even more from this optimization, as the
likelihood of hitting the same pixels is higher.
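
To make the idea concrete, here is a minimal sketch of the kind of check
being discussed. It is not the attached patch and not pixman code:
fetch_pixel, fetch_count, scale_row_naive and scale_row_cached are
hypothetical names, and the sketch assumes nearest-neighbour sampling of a
single row with 16.16 fixed-point stepping.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the expensive per-pixel fetch (do_fetch in the
 * discussion). The counter only exists to show how many fetches happen. */
static unsigned long fetch_count;

static uint32_t fetch_pixel(const uint32_t *src, int x)
{
    fetch_count++;
    return src[x];
}

/* Reference version: one fetch per destination pixel. */
static void scale_row_naive(uint32_t *dest, size_t dest_len,
                            const uint32_t *src, uint32_t step_fixed)
{
    uint32_t pos = 0; /* source position, 16.16 fixed point */

    for (size_t i = 0; i < dest_len; i++) {
        dest[i] = fetch_pixel(src, (int)(pos >> 16));
        pos += step_fixed;
    }
}

/* Cached version: fetch again only when the integer source coordinate
 * differs from the one used for the previous destination pixel. */
static void scale_row_cached(uint32_t *dest, size_t dest_len,
                             const uint32_t *src, uint32_t step_fixed)
{
    uint32_t pos = 0;   /* source position, 16.16 fixed point */
    int last_x = -1;    /* no source pixel fetched yet */
    uint32_t pixel = 0;

    for (size_t i = 0; i < dest_len; i++) {
        int x = (int)(pos >> 16);

        if (x != last_x) {
            pixel = fetch_pixel(src, x);
            last_x = x;
        }
        dest[i] = pixel;
        pos += step_fixed;
    }
}
```

With a step of 0x4000 (0.25 source pixels per destination pixel, i.e. 4x
magnification), the cached loop fetches each source pixel once instead of
four times. With a step of 1.0 or more (no scaling or minification), every
destination pixel maps to a new source pixel, so the cached loop saves no
fetches and only adds the comparison.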

Cheers,
Antoine

> -----Original Message-----
> From: cairo-bounces at cairographics.org 
> [mailto:cairo-bounces at cairographics.org] On Behalf Of Jeff Muizelaar
> Sent: Wednesday, November 05, 2008 7:11 PM
> To: André Tupinambá
> Cc: Cairo mailing list
> Subject: Re: [cairo] Transform optimization
> 
> On Wed, Nov 05, 2008 at 09:27:17PM -0200, André Tupinambá wrote:
> > Hi everyone,
> > 
> > I'm searching for some opportunities to optimize other pieces of code,
> > and I'm working now with the transformation code.
> 
> Thanks for looking into this; the transformation code could definitely
> use some work.
> 
> > Checking with VTune, I saw a major hotspot at the pixel-fetching call
> > (do_fetch). So I tried to reduce these fetches by checking whether I had
> > just read the same pixel, and it works well for magnifying (about 2x on
> > Core2 and 1.66x on Turion).
> 
> What about when minifying? It seems like this patch would cause a
> slowdown because the fetches are always to different pixels.
> 
> It would also be good to know why fetches are so slow. In theory it
> should only be a cached memory access which should be pretty quick.
> However, I can see the advantage of avoiding fetching when we are
> upscaling by a large amount, since the samples don't change for an entire
> region of destination pixels. But I don't really like the idea of adding
> more code to the inside of an inner loop, certainly not without some more
> performance numbers for scaling to different sizes. It would also be
> good to have some idea about what the cost of a fetch is, so that we know
> how important they are to avoid.
> 
> Further, I wonder if a more implicit approach would work better. If we
> could do a better job knowing when we need to read a new sample
> we wouldn't need to test for it every time. Something like:
> 
> while (dest < dest_end) {
>    compute_src_pixel_location();
>    fetch_src_pixels();
>    while (dest < dest_end && src_pixels_the_same) {
>      *dest = compute_dest_pixel(src_pixels);
>      dest++;
>    }
> }
> 
> instead of your patch, which does something more like:
> 
> while (dest < dest_end) {
>    compute_src_pixel_location();
>    if (!src_pixels_the_same) {
>      fetch_src_pixels();
>    }
>    *dest = compute_dest_pixel(src_pixels);
>    dest++;
> }
> 
> -Jeff
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0005-Optimization-to-reduce-fetching-calls-in-scaled-up-o.patch
Type: application/octet-stream
Size: 0 bytes
Desc: not available
Url : http://lists.cairographics.org/archives/cairo/attachments/20081105/6a5c2d7a/attachment-0001.obj