[Pixman] [ssse3]Optimization for fetch_scanline_x8r8g8b8
Siarhei Siamashka
siarhei.siamashka at gmail.com
Sat Aug 21 10:48:51 PDT 2010
On Friday 20 August 2010 18:39:47 Liu, Xinyun wrote:
> Hi Siarhei Siamashka,
>
> Here is a new patch, can you review it? Thank you!
Sure, thanks for the updated patch. Some comments follow.
> From 9783651899a2763d7fcca2960fc354bd1f541980 Mon Sep 17 00:00:00 2001
> From: root <root at d501.localdomain>
A minor nitpick here. The 'From:' header seems to be wrong, it does not look
like a real patch author name.
> +dnl Check for SSSE3
> +
> +if test "x$SSSE3_CFLAGS" = "x" ; then
> + if test "x$SUNCC" = "xyes"; then
> + # SSSE3 is enabled by default in the Sun Studio 64-bit environment
Is it a valid statement with the regards to Sun Studio? Looks like it is a
direct copy/paste from SSE2. Can we be sure that there is nothing redundant or
just plain wrong here?
> +#if defined(__GNUC__) && (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ <
3))
> +# if !defined(__amd64__) && !defined(__x86_64__)
Is this 'defined(__amd64__)' check needed here? Looks like it is again a
copy/paste from SSE2 code, but SSE2 is guaranteed to be supported for x86-64
systems.
> +# error "Need GCC >= 4.3 for SSSE3 intrinsics on x86"
> +# endif
> +#endif
> diff --git a/pixman/pixman-access-ssse3_x86-64.S b/pixman/pixman-access-
ssse3_x86-64.S
> new file mode 100644
> index 0000000..fec93df
> --- /dev/null
> +++ b/pixman/pixman-access-ssse3_x86-64.S
Copyright notice seems to be still missing in the newly added source files.
> + cmp %dil, %sil
> + jb L(bk_write)
> + add %rdx, %rsi
> + add %rdx, %rdi
> + BRANCH_TO_JMPTBL_ENTRY (L(table_48bytes_fwd), %rdx, 4)
> +L(bk_write):
> + BRANCH_TO_JMPTBL_ENTRY (L(table_48_bytes_bwd), %rdx, 4)
Would it make sense to also purge this 'bk_write' and 'table_48_bytes_bwd'
stuff?
> diff --git a/pixman/pixman-cpu.c b/pixman/pixman-cpu.c
> index e4fb1e4..ab2d69a 100644
[...]
> +#else /* end dt_cpu32() */
> +
> +static unsigned int
> +detect_cpu_features (void)
> +{
> + unsigned int features = 0;
> + unsigned int result = 0;
> + unsigned int result_c = 0;
This looks like a duplication of the code to make a new 64-bit version of
'detect_cpu_features' function. Could we assume that all 64-bit processors
support CPUID instruction (so that the check for CPUID support can be omitted),
and also support MMX and SSE2? In this case SSSE3 check will be just a tiny
block of inline assembly using CPUID instruction.
> diff --git a/pixman/pixman-sse2.c b/pixman/pixman-sse2.c
> index 3dd7967..0cfd554 100644
> --- a/pixman/pixman-sse2.c
> +++ b/pixman/pixman-sse2.c
> @@ -3520,6 +3520,56 @@ sse2_composite_over_8888_n_8888
(pixman_implementation_t *imp,
> /*---------------------------------------------------------------------
> * composite_over_8888_n_8888
> */
> +#if defined(USE_SSSE3) && !defined(_MSC_VER)
> +
> +extern void *composite_src_x888_8888_ssse3_fast_path( uint32_t * dest,
uint32_t *src, int32_t count);
> +extern pixman_bool_t choose_ssse3_fast_patch;
> +static void
> +ssse3_composite_src_x888_8888 (pixman_implementation_t *imp,
The addition of 'ssse3_composite_src_x888_8888' into 'pixman-sse2.c' does not
look like a good idea. Adding a new 'pixman-ssse3.c' file for SSSE3
optimizations would have been a better choice.
Especially considering the following part of code:
> +#if defined(USE_SSSE3) && !defined(_MSC_VER)
> + PIXMAN_STD_FAST_PATH (SRC, x8r8g8b8, null, a8r8g8b8,
ssse3_composite_src_x888_8888),
> + PIXMAN_STD_FAST_PATH (SRC, x8b8g8r8, null, a8b8g8r8,
ssse3_composite_src_x888_8888),
> +#else
> PIXMAN_STD_FAST_PATH (SRC, x8r8g8b8, null, a8r8g8b8,
sse2_composite_src_x888_8888),
> PIXMAN_STD_FAST_PATH (SRC, x8b8g8r8, null, a8b8g8r8,
sse2_composite_src_x888_8888),
> +#endif
By using "#if defined(USE_SSSE3)" here, you effectively disable
'sse2_composite_src_x888_8888' fast path for the processors which do not have
SSSE3 support. If the runtime detection of SSSE3 fails, the code fallbacks to C
implementation and we have no chance to benefit from SSE2 optimizations.
--
Best regards,
Siarhei Siamashka
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.freedesktop.org/archives/pixman/attachments/20100821/d9495579/attachment.pgp>
More information about the Pixman
mailing list