[cairo] Transform optimization

André Tupinambá andrelrt at gmail.com
Fri Mar 6 23:04:09 PST 2009


Hi Soeren,

My turn for the vacations! the Brazilian's carnaval just ended, so I
came back to work :)

Answers below (and a new patch).

Regards

André Tupinambá

On Wed, Feb 11, 2009 at 9:11 PM, Soeren Sandmann <sandmann at daimi.au.dk> wrote:
> Hi André,
>
> Sorry about the extended absence. I finally got some time to read
> through the patch.
>
> Overall, I like the approach of doing this in the general scanline
> fetcher since that provides the benefit for all the operator, and the
> bilinearInterpolation functions make the existing bilinear code much
> more readable.
>
> Some general comments on the patch:
>
> * I would like to see comments that explain the various phases that
>  the code goes through. Just things like
>
>        /* Zero fill the part of the scanline that is to the left of
>         * the image */

Done, but could you please check the English?

>
>  Or alternatively, the phases could be made separate functions,
>  though this may complicate the code more than its worth.
>
> * Am I misreading the code, or is it missing a final zero-fill phase
>  for the pixels that lie to the right of the image?

Oops... Problem solved :)

>
> * Can the code that is computing the bilinear interpolation where one
>  of the columns is zero (the code for the left and right edge) be
>  factored out? Maybe it could even be reused in the general-purpose
>  bilinear code?

I separate the edge functions and the code became more readable, but I
can't use it in general bilinear fetch now. The main problem is to
identify where is the edge of the image in general code. Today the
check is in do_fetch code, we need to change it. I will think about
how I can do that.

>
>> BTW, I installed a 32bits linux (in my old mobile P4) and ran the the
>> cairo-perf with bilinearInterpolation optimized with uint64_t
>> variables. And my worries was confirmed, this version is slower than
>> the 32bits one. I don't know if keep this two versions is a problem,
>> looks good for me.
>
> It's fine to have separate 64 and 32 bit versions of the code, but I
> think we need a configure-time test for 64-bit here, so that the right
> code is used on other 64 bit architectures than x86-64.

Done, I also created another configure switch to disable the 64bits
optimization.

>
>
> Specific comments on the code:
>
>> +    if (!mask)
>> +    {
>> +        maskInc = 0;
>> +        maskBits = 1;
>> +        pMask = &maskBits;
>> +    }
>> +    else
>> +    {
>> +        maskInc = 1;
>> +        pMask = mask;
>> +    }
>
> I'm not completely convinced that the
>
>        if (*pMask & maskBits)
>           ...;
>
> is a net win, since *pMask is now always a memory reference, whereas
> "mask" could be register allocated.
>
> But in any case this idea seems to be unrelated to the rest of the
> patch, so I'd like to see it happen in a separate patch, and then, if
> it actually is a win, do it for convolutions and source pictures as
> well, not just for fetchTransformed().

Since the "mask" it's also a pointer, it could not be register
allocated all over the time. And yes, this code is faster than the old
one, but just a bit (something from 5% to 10%). It's too little to
make a big change in all code, I think.

>
>> +    if (vec0 < (int) 0xffff0000)
>> +    {
>> +        int points = ((0xffff0000 - vec0) / unit0) >> 16;
>
> We need a new #define pixman_fixed_minus_1 in pixman.h, which should
> then be used here.

Done.

>
>> +    if ((pTop == zero) &&
>> +        (pBottom == zero))
>> +    {
>> +        if (!mask)
>> +        {
>> +           memset (buffer, 0, width * sizeof(uint32_t));
>> +        }
>> +        else
>> +        {
>> +            for (i = 0; i<width; i++)
>> +            {
>> +                if (mask[i] & maskBits)
>> +                    buffer[i] = 0;
>> +            }
>> +        }
>> +
>> +        return;
>> +    }
>
> I think this should just be a memset(). There is no real point
> optimizing out memory writes, especially when the alternative involves
> the mask[i] memory reads.

Done.

>
>> +    pictWidth += 0x00010000;
>
> 0x00010000 should be pixman_fixed_1 here.
>

Done.

>>         }
>>      }
>> +    }
>>      else if (pict->common.filter == PIXMAN_FILTER_CONVOLUTION)
>>      {
>
> The code above this brace needs to be indented.
>

Done.

>
> Thanks,
> Soren
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pixman-transform.patch.bz2
Type: application/x-bzip2
Size: 4354 bytes
Desc: not available
Url : http://lists.cairographics.org/archives/cairo/attachments/20090307/42219523/attachment.bin 


More information about the cairo mailing list