[cairo] Transform optimization
Soeren Sandmann
sandmann at daimi.au.dk
Tue Mar 24 11:21:30 PDT 2009
Hi André,
> > Some general comments on the patch:
> >
> > * I would like to see comments that explain the various phases that
> > the code goes through. Just things like
> >
> > /* Zero fill the part of the scanline that is to the left of
> > * the image */
>
> Done, but could you please check the English?
It's generally easy enough to understand, but I'll fix up a couple of
grammar issues before committing it.
> > * Can the code that is computing the bilinear interpolation where one
> > of the columns is zero (the code for the left and right edge) be
> > factored out? Maybe it could even be reused in the general-purpose
> > bilinear code?
>
> I separate the edge functions and the code became more readable, but I
> can't use it in general bilinear fetch now. The main problem is to
> identify where is the edge of the image in general code. Today the
> check is in do_fetch code, we need to change it. I will think about
> how I can do that.
I think it's fine to have separate fetching code for the edges and the
main part of the image. What I meant was if we could avoid the
special-case code for bilinear_left/rightEdge.
Ie., I'd delete bilinear_leftEdge() and just call
bilinearInterpolation (0, tr, 0, br, distx, disty);
Similarly for bilinear_rightEdge().
If we make the bilinearInterpolation() functions force_inline, rather
than just inline, the compiler should be able to optimize it down to
what he special cases did.
> >> BTW, I installed a 32bits linux (in my old mobile P4) and ran the the
> >> cairo-perf with bilinearInterpolation optimized with uint64_t
> >> variables. And my worries was confirmed, this version is slower than
> >> the 32bits one. I don't know if keep this two versions is a problem,
> >> looks good for me.
> >
> > It's fine to have separate 64 and 32 bit versions of the code, but I
> > think we need a configure-time test for 64-bit here, so that the right
> > code is used on other 64 bit architectures than x86-64.
>
> Done, I also created another configure switch to disable the 64bits
> optimization.
Actually, I think this can be done in a simpler way by using
AC_CHECK_SIZEOF(long)
and then
#if SIZEOF_LONG > 4
...
#else
...
#endif
While it might be nice to be able to turn 64 bit optimizations on or
off, I don't think it's essential, and we can add it later if
necessary.
> Since the "mask" it's also a pointer, it could not be register
> allocated all over the time. And yes, this code is faster than the old
> one, but just a bit (something from 5% to 10%). It's too little to
> make a big change in all code, I think.
I mean the _expression_ *pMask will always cause a memory reference,
whereas in the old code
if (!mask || mask[i] & maskBits)
the variable "mask" could be register allocated, and if it was NULL,
no memory reference would take place. But it's not a huge issue.
The final comment I have is about the name
fpFetchBilinear32h_affine_repeatNone
I'd like to move towards less complicated names. So can we just call
this
fetch_bilinear_affine_repeat_none?
Thanks,
Soren
More information about the cairo
mailing list