[cairo] Transform optimization
André Tupinambá
andrelrt at gmail.com
Fri Mar 6 23:04:09 PST 2009
Hi Soeren,
My turn for the vacations! the Brazilian's carnaval just ended, so I
came back to work :)
Answers below (and a new patch).
Regards
André Tupinambá
On Wed, Feb 11, 2009 at 9:11 PM, Soeren Sandmann <sandmann at daimi.au.dk> wrote:
> Hi André,
>
> Sorry about the extended absence. I finally got some time to read
> through the patch.
>
> Overall, I like the approach of doing this in the general scanline
> fetcher since that provides the benefit for all the operator, and the
> bilinearInterpolation functions make the existing bilinear code much
> more readable.
>
> Some general comments on the patch:
>
> * I would like to see comments that explain the various phases that
> the code goes through. Just things like
>
> /* Zero fill the part of the scanline that is to the left of
> * the image */
Done, but could you please check the English?
>
> Or alternatively, the phases could be made separate functions,
> though this may complicate the code more than its worth.
>
> * Am I misreading the code, or is it missing a final zero-fill phase
> for the pixels that lie to the right of the image?
Oops... Problem solved :)
>
> * Can the code that is computing the bilinear interpolation where one
> of the columns is zero (the code for the left and right edge) be
> factored out? Maybe it could even be reused in the general-purpose
> bilinear code?
I separate the edge functions and the code became more readable, but I
can't use it in general bilinear fetch now. The main problem is to
identify where is the edge of the image in general code. Today the
check is in do_fetch code, we need to change it. I will think about
how I can do that.
>
>> BTW, I installed a 32bits linux (in my old mobile P4) and ran the the
>> cairo-perf with bilinearInterpolation optimized with uint64_t
>> variables. And my worries was confirmed, this version is slower than
>> the 32bits one. I don't know if keep this two versions is a problem,
>> looks good for me.
>
> It's fine to have separate 64 and 32 bit versions of the code, but I
> think we need a configure-time test for 64-bit here, so that the right
> code is used on other 64 bit architectures than x86-64.
Done, I also created another configure switch to disable the 64bits
optimization.
>
>
> Specific comments on the code:
>
>> + if (!mask)
>> + {
>> + maskInc = 0;
>> + maskBits = 1;
>> + pMask = &maskBits;
>> + }
>> + else
>> + {
>> + maskInc = 1;
>> + pMask = mask;
>> + }
>
> I'm not completely convinced that the
>
> if (*pMask & maskBits)
> ...;
>
> is a net win, since *pMask is now always a memory reference, whereas
> "mask" could be register allocated.
>
> But in any case this idea seems to be unrelated to the rest of the
> patch, so I'd like to see it happen in a separate patch, and then, if
> it actually is a win, do it for convolutions and source pictures as
> well, not just for fetchTransformed().
Since the "mask" it's also a pointer, it could not be register
allocated all over the time. And yes, this code is faster than the old
one, but just a bit (something from 5% to 10%). It's too little to
make a big change in all code, I think.
>
>> + if (vec0 < (int) 0xffff0000)
>> + {
>> + int points = ((0xffff0000 - vec0) / unit0) >> 16;
>
> We need a new #define pixman_fixed_minus_1 in pixman.h, which should
> then be used here.
Done.
>
>> + if ((pTop == zero) &&
>> + (pBottom == zero))
>> + {
>> + if (!mask)
>> + {
>> + memset (buffer, 0, width * sizeof(uint32_t));
>> + }
>> + else
>> + {
>> + for (i = 0; i<width; i++)
>> + {
>> + if (mask[i] & maskBits)
>> + buffer[i] = 0;
>> + }
>> + }
>> +
>> + return;
>> + }
>
> I think this should just be a memset(). There is no real point
> optimizing out memory writes, especially when the alternative involves
> the mask[i] memory reads.
Done.
>
>> + pictWidth += 0x00010000;
>
> 0x00010000 should be pixman_fixed_1 here.
>
Done.
>> }
>> }
>> + }
>> else if (pict->common.filter == PIXMAN_FILTER_CONVOLUTION)
>> {
>
> The code above this brace needs to be indented.
>
Done.
>
> Thanks,
> Soren
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pixman-transform.patch.bz2
Type: application/x-bzip2
Size: 4354 bytes
Desc: not available
Url : http://lists.cairographics.org/archives/cairo/attachments/20090307/42219523/attachment.bin
More information about the cairo
mailing list