[cairo] [PATCH] pixman: fast path for nearest neighbour scaled compositing operations.

Siarhei Siamashka siarhei.siamashka at nokia.com
Fri May 29 13:00:31 PDT 2009


On Thursday 28 May 2009 21:28:31 ext Jeff Muizelaar wrote:
> On Thu, May 28, 2009 at 05:05:55AM +0300, Siarhei Siamashka wrote:
> > Hello,
> >
> > This is more complete implementation of nearest neighbor scaling fastpath
> > functions. OVER 8888x8888, OVER 8888x0565, SRC 8888x8888,
> > SRC 8888x0565 and SRC 0565x0565 cases are supported.
>
> Great to see this work. Here are some quick comments:
> > +static void
> > fbCompositeTransformNearestNonrotatedAffineTrivialclipOver_8888x0565 ( + 
> >   pixman_image_t *pSrc, pixman_image_t *pDst, int xSrc, int ySrc, int
> > xDst, int yDst, +    int width, int height, int32_t vv0, int32_t vv1,
> > int32_t uv0, int32_t uv1) +{
>
> I think these fuctions would be easier to understand if the variables
> were renamed:
> vv0 -> vx;
> vv1 -> vy;
> uv0 -> unit_x;
> uv1 -> unit_y;
>
> I think x and y a are clearer then 0 and 1

done

> > +    uint16_t *dstLine;
> > +    uint32_t *xbits;
> > +    uint32_t  d;
> > +    uint32_t  s1, s2;
> > +    uint8_t   a1, a2;

> Overall more explanatory variable names would make understanding easier.
> Perhaps alpha1 instead of just a1?

Renamed:
src -> srcFirstLine
xbits -> src

The other variables are consistent with the rest of file (the same naming 
conventions exist in other functions). They are mostly temporary variables
which are really hard to misunderstand. If you have some specific suggestions
about how to better name these variables, it can be done easily.

> > +            if (a1 == 0xff)
> > +                WRITE(pDst, dst, cvt8888to0565(s1));
> > +            else if (s1) {
> > +                d = cvt0565to0888(READ(pDst, dst));
> > +                a1 ^= 0xff;
>
> things like this could also maybe use a comment or a macro:
> e.g. //equivalent to 255 - a1
> in fact is a1 ^= 0xff actually faster than a1 = 255 - a1?

It's just a single instruction on most (or even all?) cpu architectures, while
for 'a1 = 255 - a1' it might not be true (though ARM has RSB instruction).

> > +#define FbByteMulAdd_store_a8r8g8b8(x, a, y) do {                   \
> > +        uint32_t t = ((x & 0xff00ff) * a) + 0x800080;               \
> > +        t = (t + ((t >> 8) & 0xff00ff)) >> 8;                       \
> > +        t &= 0xff00ff;                                              \
> > +        t += y & 0xff00ff;                                          \
> > +        t |= 0x1000100 - ((t >> 8) & 0xff00ff);                     \
> > +        t &= 0xff00ff;                                              \
> > +                                                                    \
> > +        x = (((x >> 8) & 0xff00ff) * a) + 0x800080;                 \
> > +        x = (x + ((x >> 8) & 0xff00ff)) >> 8;                       \
> > +        x &= 0xff00ff;                                              \
> > +        x += (y >> 8) & 0xff00ff;                                   \
> > +        x |= 0x1000100 - ((x >> 8) & 0xff00ff);                     \
> > +        x &= 0xff00ff;                                              \
> > +        x <<= 8;                                                    \
> > +        x += t;                                                     \
> > +    } while (0)
>
> This looks like a copy of FbByteMulAdd

Changed the code to use FbByteMulAdd

Additionally:
1. Scaling test utility patch got more comments about its usage (btw, it would
be nice to have all the pixman tests well commented, just in order to know how
to interpret their results).
2. Removed the use of memcmp from the source clipping test patch. It is
potentially not quite safe as the compiler may pad the structure with
undefined data.

Overall, I don't quite like code duplication (many functions sharing the same
code), but alternatives may look a bit more obfuscated. Alternatives could use
either C preprocessor or perl script for emitting scaler functions from a
common template. In any case, the most important thing is that the code has
an extensive correctless test, so it is easy to transform code to any other
representation without too much risk of breaking it.

-- 
Best regards,
Siarhei Siamashka
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Scaling-test-updated-to-provide-better-coverage-for.patch
Type: text/x-diff
Size: 7453 bytes
Desc: not available
Url : http://lists.cairographics.org/archives/cairo/attachments/20090529/ec256ed5/attachment-0003.patch 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Check-if-we-actually-have-source-clipping-set-for-im.patch
Type: text/x-diff
Size: 2103 bytes
Desc: not available
Url : http://lists.cairographics.org/archives/cairo/attachments/20090529/ec256ed5/attachment-0004.patch 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-Fastpath-for-nearest-neighbour-scaled-compositing-op.patch
Type: text/x-diff
Size: 33109 bytes
Desc: not available
Url : http://lists.cairographics.org/archives/cairo/attachments/20090529/ec256ed5/attachment-0005.patch 


More information about the cairo mailing list