[cairo] New ARMv7-A (NEON) optimisations for Pixman

Jeff Muizelaar jeff at infidigm.net
Thu May 7 08:15:49 PDT 2009


On Thu, May 07, 2009 at 11:34:22AM +0000, Jonathan Morton wrote:
> One removes the #ifdef magic kludge from Ian's code, and replaces it
> with the autoconf test from my previous patch.  Just a bit of cleanup.
> 
> The other adds some basic NEON blitters for RGB565 framebuffers,
> covering SRC RGB565 and SRC xRGB8888.  On our test hardware they get
> very close to maximum memory bandwidth.
> 
> -- 
> ------
> From: Jonathan Morton
>       jonathan.morton at movial.com
> 

> From d972ecbb7c8c19589cbd133a7b2a3a900fa0856c Mon Sep 17 00:00:00 2001
> From: Jonathan Morton <jmorton at sd070.hel.movial.fi>
> Date: Thu, 7 May 2009 11:54:15 +0300
> Subject: [PATCH] Test USE_GCC_INLINE_ASM instead of USE_NEON_INLINE_ASM.  The former is now Autoconf enabled, and does what it says on the tin.

Pushed.

> From f2a9ed3645013b6a95b92887f7a0577fd151f23d Mon Sep 17 00:00:00 2001
> From: Jonathan Morton <jmorton at sd070.hel.movial.fi>
> Date: Thu, 7 May 2009 12:20:02 +0300
> Subject: [PATCH] Add some NEON blitters for 16-bit framebuffers.
> 
> ---
>  pixman/pixman-arm-neon.c |  237 +++++++++++++++++++++++++++++++++++++++++++++-
>  pixman/pixman-arm-neon.h |   30 ++++++
>  pixman/pixman-pict.c     |   12 +++
>  pixman/pixman-utils.c    |    1 +
>  4 files changed, 279 insertions(+), 1 deletions(-)
> 
> diff --git a/pixman/pixman-arm-neon.c b/pixman/pixman-arm-neon.c
> index 51f7d55..3517d2d 100644
> --- a/pixman/pixman-arm-neon.c
> +++ b/pixman/pixman-arm-neon.c
> @@ -1,5 +1,5 @@
>  /*
> - * Copyright © 2009 ARM Ltd
> + * Copyright © 2009 ARM Ltd, Movial Creative Technologies Oy
>   *
>   * Permission to use, copy, modify, distribute, and sell this software and its
>   * documentation for any purpose is hereby granted without fee, provided that
> @@ -21,6 +21,7 @@
>   * SOFTWARE.
>   *
>   * Author:  Ian Rickards (ian.rickards at arm.com) 
> + * Author:  Jonathan Morton (jonathan.morton at movial.com)
>   *
>   */
>  
> @@ -31,6 +32,9 @@
>  #include "pixman-arm-neon.h"
>  
>  #include <arm_neon.h>
> +#include <string.h>
> +#include <stdio.h>
> +#include <assert.h>

I don't think these headers are needed.

>  
>  static force_inline uint8x8x4_t unpack0565(uint16x8_t rgb)
> @@ -1376,3 +1380,234 @@ fbCompositeSrcAdd_8888x8x8neon (pixman_op_t op,
>      }
>  }
>  
> +#ifdef USE_GCC_INLINE_ASM
> +
> +void
> +fbCompositeSrc_16x16neon (
> +	pixman_op_t op,
> +	pixman_image_t * pSrc,
> +	pixman_image_t * pMask,
> +	pixman_image_t * pDst,
> +	int16_t      xSrc,
> +	int16_t      ySrc,
> +	int16_t      xMask,
> +	int16_t      yMask,
> +	int16_t      xDst,
> +	int16_t      yDst,
> +	uint16_t     width,
> +	uint16_t     height)
> +{
> +	uint16_t    *dstLine, *srcLine;
> +	uint32_t     dstStride, srcStride;
> +
> +	if(!height || !width)
> +		return;
> +
> +	/* We simply copy 16-bit-aligned pixels from one place to another. */
> +	fbComposeGetStart (pSrc, xSrc, ySrc, uint16_t, srcStride, srcLine, 1);
> +	fbComposeGetStart (pDst, xDst, yDst, uint16_t, dstStride, dstLine, 1);
> +
> +	/* Preload the first input scanline */
> +	{
> +		uint16_t *srcPtr = srcLine;
> +		uint32_t count = width;
> +
> +		asm volatile (
> +		"0: @ loop							\n"
> +		"	SUBS    %[count], %[count], #32				\n"
> +		"	PLD     [%[src]]					\n"
> +		"	ADD     %[src], %[src], #64				\n"
> +		"	BGT 0b							\n"
> +

I think it would be better if you used lowercase assembler mnemonics, to be
consistent with the rest of the file.

-Jeff
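
For reference, here is a rough sketch of how the quoted preload loop might
look with lowercase mnemonics. It is not taken from the patch: the helper
name preload_scanline is hypothetical, and the operand and clobber lists are
assumptions, because the quoted hunk stops before them. On non-ARM builds it
falls back to the GCC/Clang __builtin_prefetch builtin so that the loop
structure can still be compiled and checked.

```c
#include <stdint.h>

/* Hypothetical helper (not in the patch): issue prefetch hints across one
 * scanline of 16-bit pixels, 32 pixels (64 bytes) per step, and return the
 * number of steps taken. width must be non-zero, as in the patch. */
static uint32_t
preload_scanline (const uint16_t *src, uint32_t width)
{
    uintptr_t start = (uintptr_t) src;
    uintptr_t addr  = start;
    int32_t   count = (int32_t) width;  /* bgt is a signed comparison */

#if defined(__arm__) && defined(__GNUC__)
    /* Operand and clobber lists are assumed; the quoted hunk ends
     * before them. */
    __asm__ volatile (
    "0: @ loop\n"
    "	subs    %[count], %[count], #32\n"
    "	pld     [%[src]]\n"
    "	add     %[src], %[src], #64\n"
    "	bgt 0b\n"
    : [src] "+r" (addr), [count] "+r" (count)
    :
    : "cc");
#else
    /* Portable equivalent of the same loop, for non-ARM builds. */
    do {
        count -= 32;
        __builtin_prefetch ((const void *) addr);
        addr += 64;
    } while (count > 0);
#endif

    return (uint32_t) ((addr - start) / 64);
}
```

Either path takes one step per 32 pixels, rounded up, so a 33-pixel scanline
issues two hints.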

