[cairo] Problem compiling pixman-sse.c under Win32, blame it on MSVC !!
Vladimir Vukicevic
vladimir at pobox.com
Thu Aug 7 09:55:00 PDT 2008
Hey Andre,
Have you had a chance to take a look at this yet? I'd like to get the
speed boost on win32 at least, still have some stack alignment issues
to fix on our end for linux.
- Vlad
On Jul 21, 2008, at 6:48 PM, André Tupinambá wrote:
> Yes, It's a really possible, but I'll test it first and check the
> generated assembly (in GCC and MSVC).
>
> If we had some slowdowns or poor compiler optimization, I will try to
> change the inline functions to defines.
>
> Regards,
>
> André Tupinambá
>
>
> On Mon, Jul 21, 2008 at 10:05 PM, Rodrigo Kumpera
> <kumpera at gmail.com> wrote:
>> Since most of these functions force inlining, argument passing by
>> pointer or
>> by value
>> should not matter, modulo compiler inability to do the right thing.
>>
>> I think that there are no issues from changing some args from
>> values to regs
>> as long as GCC still manages to generate decent code. From what I
>> recall,
>> this is not
>> the case and some reasonable recent version of it should do the
>> trick.
>>
>> Frédéric, do you mind testing that for gcc?
>>
>> Thanks,
>> Rodrigo
>>
>> On Mon, Jul 21, 2008 at 6:19 AM, Antoine Azar
>> <cairo at antoineazar.com> wrote:
>>>
>>> Yep, I've tried everything on my side (playing with compiler
>>> switches,
>>> setting the function parameters as const references, wrapping in
>>> structs,
>>> changing stack size, etc) and didn't have more luck than Fred.
>>>
>>> VC++ always borks starting the fourth __m128i parameter only. The
>>> three
>>> first ones are always ok. It seems this is a limitation of VC++,
>>> that
>>> it'll
>>> only handle a "pass by register" of up to three SSE intrinsics,
>>> and not
>>> one
>>> more.
>>>
>>> I don't think sending pointers would be much of an issue: this is
>>> already
>>> done at many other places in the code, and it'll avoid the
>>> overhead of
>>> passing by value 16 byte data structures.
>>> Otherwise, we can try rewriting functions so they always use at
>>> most 3 SSE
>>> intrinsics that are passed by value, but that might be a pain.
>>>
>>> I'd be curious to see how the Intel Compiler handles this.
>>>
>>> Antoine
>>>
>>>
>>>> -----Original Message-----
>>>> From: cairo-bounces at cairographics.org
>>>> [mailto:cairo-bounces at cairographics.org] On Behalf Of Frédéric
>>>> Plourde
>>>> Sent: Sunday, July 20, 2008 10:03 PM
>>>> To: cairo at cairographics.org
>>>> Subject: [cairo] Problem compiling pixman-sse.c under
>>>> Win32,blame it on MSVC !!
>>>>
>>>> Hi !
>>>>
>>>> Vlad had emailled me with this problem compiling pixman
>>>> sith SSE2 stuff enabled.
>>>> In fact, here are the errors that we got :
>>>>
>>>> pixman-sse.c
>>>> pixman-sse.c(142) : error C2719: 'xmm3': formal parameter with
>>>> __declspec(align('16')) won't be aligned
>>>> pixman-sse.c(195) : error C2719: 'alphaHi': formal parameter with
>>>> __declspec(align('16')) won't be aligned
>>>> pixman-sse.c(210) : error C2719: 'alphaDstHi': formal parameter
>>>> with
>>>> __declspec(align('16')) won't be aligned
>>>> pixman-sse.c(210) : error C2719: 'dstLo': formal parameter with
>>>> __declspec(align('16')) won't be aligned
>>>> pixman-sse.c(210) : error C2719: 'dstHi': formal parameter with
>>>> __declspec(align('16')) won't be aligned
>>>> pixman-sse.c(210) : error C2719: 'alphaSrcLo': formal parameter
>>>> with
>>>> __declspec(align('16')) won't be aligned
>>>> pixman-sse.c(210) : error C2719: 'alphaSrcHi': formal parameter
>>>> with
>>>> __declspec(align('16')) won't be aligned
>>>> pixman-sse.c(246) : error C2719: 'alphaHi': formal parameter with
>>>> __declspec(align('16')) won't be aligned
>>>> pixman-sse.c(276) : error C2719: 'alphaHi': formal parameter with
>>>> __declspec(align('16')) won't be aligned
>>>> pixman-sse.c(276) : error C2719: 'maskLo': formal parameter with
>>>> __declspec(align('16')) won't be aligned
>>>> pixman-sse.c(276) : error C2719: 'maskHi': formal parameter with
>>>> __declspec(align('16')) won't be aligned
>>>> pixman-sse.c(290) : warning C4133: 'function' : incompatible
>>>> types - from '__m128i *' to 'const char *'
>>>> pixman-sse.c(296) : warning C4133: 'function' : incompatible
>>>> types - from '__m128i *' to 'const char *'
>>>> pixman-sse.c(371) : error C2719: 'alphaSrc': formal parameter with
>>>> __declspec(align('8')) won't be aligned
>>>> pixman-sse.c(398) : error C2719: 'dst': formal parameter with
>>>> __declspec(align('8')) won't be aligned
>>>> pixman-sse.c(4432) : error C2275: '__m128i' : illegal use of
>>>> this type as an expression
>>>> D:\msvs8\VC\INCLUDE\emmintrin.h(45) : see declaration
>>>> of '__m128i'
>>>> pixman-sse.c(4432) : error C2146: syntax error : missing ';'
>>>> before identifier 'xmm0'
>>>> pixman-sse.c(4432) : error C2065: 'xmm0' : undeclared identifier
>>>> pixman-sse.c(4432) : error C2065: 'xmm1' : undeclared identifier
>>>> pixman-sse.c(4432) : error C2065: 'xmm2' : undeclared identifier
>>>> pixman-sse.c(4432) : error C2065: 'xmm3' : undeclared identifier
>>>> pixman-sse.c(4434) : error C2440: '=' : cannot convert from
>>>> '__m128i' to 'int'
>>>> pixman-sse.c(4435) : error C2440: '=' : cannot convert from
>>>> '__m128i' to 'int'
>>>> pixman-sse.c(4436) : error C2440: '=' : cannot convert from
>>>> '__m128i' to 'int'
>>>> pixman-sse.c(4437) : error C2440: '=' : cannot convert from
>>>> '__m128i' to 'int'
>>>> pixman-sse.c(4439) : error C2440: 'function' : cannot convert
>>>> from 'int'
>>>> to '__m128i'
>>>> pixman-sse.c(4439) : warning C4024: 'save128Aligned' :
>>>> different types for formal and actual parameter 2
>>>> pixman-sse.c(4440) : error C2440: 'function' : cannot convert
>>>> from 'int'
>>>> to '__m128i'
>>>> pixman-sse.c(4440) : warning C4024: 'save128Aligned' :
>>>> different types for formal and actual parameter 2
>>>> pixman-sse.c(4441) : error C2440: 'function' : cannot convert
>>>> from 'int'
>>>> to '__m128i'
>>>> pixman-sse.c(4441) : warning C4024: 'save128Aligned' :
>>>> different types for formal and actual parameter 2
>>>> pixman-sse.c(4442) : error C2440: 'function' : cannot convert
>>>> from 'int'
>>>> to '__m128i'
>>>> pixman-sse.c(4442) : warning C4024: 'save128Aligned' :
>>>> different types for formal and actual parameter 2
>>>>
>>>>
>>>> We found the following :
>>>>
>>>>
>>>> for every C2719 errors, the solution is to avoid passing
>>>> __m128i values directly, because MSVC just doesn't allow
>>>> function parameters to be of types declared with the
>>>> "__declspec(align('#'))" modifier (this is the case of
>>>> __m128i). See:
>>>> http://msdn.microsoft.com/en-us/library/373ak2y1(VS.80).aspx
>>>>
>>>> More problematic is the behaviour of MSVC under these
>>>> assumptions... in fact, try removing ONLY the xmm3 function
>>>> parameter at line #142 in pixman-sse.c (and by commenting
>>>> references to xmm3 inside the function, of course), it will
>>>> compile. (!!). I've went through different forums for
>>>> developpers and it seems there are still some bugs with those
>>>> intrinsics in MSVC8. Very dissapointing!
>>>>
>>>> So my recommandation to misters Tupinamba and Kumpera would
>>>> be to pass POINTERS to those function parameters instead...
>>>> as :
>>>>
>>>> static inline __m128i
>>>> pack565_4x128_128 (__m128i *xmm0, __m128i *xmm1, __m128i
>>>> *xmm2, __m128i *xmm3) {
>>>> __m128i lo, hi;
>>>>
>>>> lo = _mm_packus_epi16 (pack565_2x128_128 ( *xmm0, *xmm1 ),
>>>> _mm_setzero_si128 ());
>>>> hi = _mm_packus_epi16 (_mm_setzero_si128 (),
>>>> pack565_2x128_128 ( *xmm2, *xmm3 ));
>>>>
>>>> return _mm_or_si128 (lo, hi);
>>>> }
>>>>
>>>> compiles perfectly.
>>>> However, this would maybe imply significant impacts on
>>>> performance... it should be addressed and profiled..
>>>>
>>>>
>>>> ******
>>>> Besides, for the C2275 error, it's even more embarassing for
>>>> microsoft... it's another case of MSVC being very capricious.
>>>> In some cases, with aligned variables, MSVC doesn't allow us
>>>> to declare
>>>> those variable in the middle of the function... We then need
>>>> to move the
>>>> declaration earlier.
>>>> see, at line 4432,
>>>>
>>>> __m128i xmm0, xmm1, xmm2, xmm3; needs to be moved up a
>>>> little... as :
>>>>
>>>> while (w >= 64)
>>>> {
>>>> __m128i xmm0, xmm1, xmm2, xmm3;
>>>>
>>>> /* 128 bytes ahead */
>>>> cachePrefetch (((__m128i*)s) + 8);
>>>> cachePrefetch (((__m128i*)d) + 8);
>>>>
>>>> compiles now...
>>>> ****************
>>>>
>>>> So here it is... a patch with those changes in pixman-sse.c
>>>> I'd like you guys (especcially André) to look that over and
>>>> tell me what
>>>> you think. It's a first draft, but it compiles under all 3
>>>> major platforms.
>>>>
>>>> Best,
>>>> -fred-
>>>>
>>>
>>> _______________________________________________
>>> cairo mailing list
>>> cairo at cairographics.org
>>> http://lists.cairographics.org/mailman/listinfo/cairo
>>
>>
>> _______________________________________________
>> cairo mailing list
>> cairo at cairographics.org
>> http://lists.cairographics.org/mailman/listinfo/cairo
>>
> _______________________________________________
> cairo mailing list
> cairo at cairographics.org
> http://lists.cairographics.org/mailman/listinfo/cairo
More information about the cairo
mailing list