Re: Why the compiler applies sign extension to unsigned data?



Alberto wrote:
I may be wrong, but if I say

void * p; uint64 p_address = (uint64) p;

I want the value of p to be made unsigned *before* the cast. I do not
want the cast to generate a sign extension.

Unfortunately, whenever you do

smalltype vsmall;
bigtype vbig = (bigtype)vsmall;

C compilers first resize vsmall to the size of bigtype (using the
signedness of smalltype), THEN switch to the signedness of bigtype.

The workaround (as you do yourself below) is to first cast to the
desired signedness but unchanged size, then to the final type.

signed smalltype vsmall;
unsigned bigtype vbig = (unsigned bigtype)(unsigned smalltype)vsmall;
/* force zero extension */

unsigned smalltype vsmall;
signed bigtype vbig = (signed bigtype)(signed smalltype)vsmall;
/* force sign extension */
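
To make the two workarounds concrete, here is a minimal sketch using the fixed-width types from C99 <stdint.h> in place of the smalltype/bigtype placeholders. The function names are made up for illustration and are not from any header.

```c
#include <stdint.h>

/* Direct cast: the value is widened using the signedness of the
 * source type, so a negative input arrives sign-extended. */
uint64_t widen_direct(int32_t v)
{
    return (uint64_t)v;
}

/* Change signedness at the original width first, then widen:
 * the upper 32 bits of the result are always zero. */
uint64_t widen_zero_extended(int32_t v)
{
    return (uint64_t)(uint32_t)v;
}

/* The opposite direction: reinterpret as signed at the original
 * width, then widen, which forces sign extension. The inner cast
 * is implementation-defined for values above INT32_MAX, but wraps
 * on the usual two's complement compilers. */
int64_t widen_sign_extended(uint32_t v)
{
    return (int64_t)(int32_t)v;
}
```

With an input of -1, widen_direct gives 0xFFFFFFFFFFFFFFFF while widen_zero_extended gives 0x00000000FFFFFFFF.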


Max pointed to one solution, although I don't immediately see why an
ULONG_PTR doesn't sign-propagate on a cast, while a void* or long*
might. I would rather use a macro like this:

ULONG_PTR is defined (in the documentation) as the unsigned integer
type that is the same size as a pointer, but otherwise as similar to
ULONG as that permits. Thus if a pointer is the same size as ULONG
(32 bits), then ULONG_PTR is ULONG, and casting the pointer to
ULONG_PTR first is the same as the second case below. But if a pointer
is the same size as ULONGLONG (64 bits), then ULONG_PTR is ULONGLONG,
and casting to ULONG_PTR first and then to ULONGLONG is a no-op and
thus the same as your first case below. (See basetsd.h for the
implementation details.)

I hope that when the time comes, Win128 with 128-bit pointers will make
ULONG_PTR 128 bits.


#ifdef SIXTY_FOUR_BIT_POINTERS
#define CAST_FROM_POINTER(ptr_) ((uint64)(ptr_))
#else
#define CAST_FROM_POINTER(ptr_) ((uint64)(uint32)(ptr_))
#endif

Then, if p is a pointer, you can say:

uint64 value = CAST_FROM_POINTER(p);

You can define uint64 and uint32 appropriately. Note that I am relying
on the compiler and on the knowledge of the size of the ptr variable,
but I find this more acceptable than using a DDK type such as ULONG_PTR
whose semantics I don't control.

Well, the ULONG_PTR semantics are controlled by the same team that
controls the semantics of everything else in that environment. And the
important thing here is that the team itself relies on those same
semantics for its own very big codebase, which gives you a nice chance
that they will keep it up to date with new architectures supported by
the system.

But if you want a non-Microsoft equivalent, size_t usually has the same
semantics, unless there are special architecture rules that arrays can
never cross some address multiples (it was like that on Win16 and other
x86 16 bit systems such as DOS and OS/2 1.x).


I'm getting used to declaring all my pointers as uint64, both in 32-bit
and 64-bit code. When I need them to actually point to something, I
also use a CAST_TO_POINTER(type,value) macro:

#ifdef SIXTY_FOUR_BIT_POINTERS
#define CAST_TO_POINTER(type_, var) ((type_ *)(var))
#else
#define CAST_TO_POINTER(type_, var) ((type_ *)(uint32)(var))
#endif

I can say, for example,

uint64 * p = CAST_TO_POINTER(uint64,p_value);
while (there_is_something_to_do) do_something(p++);

Or, more simply,

while (there_is_something_to_do)
    do_something(CAST_TO_POINTER(uint64, p_value));

By using these macros, I keep the semantics of type casting firmly
under my control, and I restrict my 32 versus 64 bit issues to a very
small set of handwritten macros. Incidentally, these macros in their
official version also work under Linux and Solaris, which don't know
what ULONG_PTR is.
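
For comparison, the same round trip can be sketched without a hand-maintained SIXTY_FOUR_BIT_POINTERS switch, assuming the compiler ships C99 <stdint.h> with uintptr_t (not every compiler of that era does). The function names below are hypothetical stand-ins for the two macros.

```c
#include <stdint.h>

/* Pointer -> 64-bit integer: convert to the unsigned pointer-sized
 * integer first, then widen, so the upper bits are zero on 32-bit
 * builds and nothing changes on 64-bit builds. */
uint64_t cast_from_pointer(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}

/* 64-bit integer -> pointer: narrow to pointer width first, then
 * convert, which avoids size-mismatch warnings on 32-bit builds. */
void *cast_to_pointer(uint64_t value)
{
    return (void *)(uintptr_t)value;
}
```

A value produced by cast_from_pointer and passed back through cast_to_pointer should compare equal to the original pointer, since it went through uintptr_t unchanged.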


Allocating and processing 64 bit values on platforms where both the
values and the registers are only 32 bits large is wasteful.

A better solution would be to define your own type similar to ULONG_PTR
but available on all platforms.

Something like this (based on working code, but simplified for this post):

#if defined(__GNUC__) && defined(__GNUC_MINOR__)
  #if __GNUC__ * 1000 + __GNUC_MINOR__ >= 2007
    /* Proprietary GNU extension to tell the compiler
     * backend to match the size and other general
     * properties of pointer types
     */
    #define INTPTR_DECORATE(typ_nam) typ_nam \
        __attribute__((__mode__(__pointer__)))
  #else
    #define INTPTR_DECORATE(typ_nam) typ_nam
  #endif
#elif defined(_MSC_VER) && !defined(_midl)
  #if (_MSC_VER >= 1300) && defined(_X86_)
    /* Proprietary MS extension to tell the 32 bit compiler
     * to assume this will compile to a 64 bit type if
     * recompiled with the 64 bit compiler. This enables
     * warnings if mixed with types that remain 32 bits
     * and suppresses warnings if mixed with types that
     * become 64 bits.
     * Neither warning is emitted unless compiling with the
     * -Wp64 option.
     */
    #define INTPTR_DECORATE(typ_nam) _w64 typ_nam
  #else
    #define INTPTR_DECORATE(typ_nam) typ_nam
  #endif
#else
  #define INTPTR_DECORATE(typ_nam) typ_nam
#endif

#if defined(_midl) && _midl > 501
  /* IDL compiler/RPC stub generator which knows about
   * handling ints whose size may vary between client
   * and server in some cases.
   */
  typedef unsigned __int3264 uintptr_t;
  typedef          __int3264  intptr_t;
#elif POINTER_BITS == 32
  typedef INTPTR_DECORATE(uint32_t uintptr_t);
  typedef INTPTR_DECORATE( int32_t  intptr_t);
#elif POINTER_BITS == 64
  typedef INTPTR_DECORATE(uint64_t uintptr_t);
  typedef INTPTR_DECORATE( int64_t  intptr_t);
#elif POINTER_BITS == 128
  typedef INTPTR_DECORATE(uint128_t uintptr_t);
  typedef INTPTR_DECORATE( int128_t  intptr_t);
#else
  /* Best guess, usually true */
  typedef INTPTR_DECORATE(size_t    uintptr_t);
  typedef INTPTR_DECORATE(ptrdiff_t  intptr_t);
#endif

/* If an assumption is wrong above, next lines will fail to compile
* complaining that the array size is negative.
*/
typedef int assertuintptr_t_size[
(sizeof(uintptr_t) == sizeof(void*)) ? 1 : -1];
typedef int assertintptr_t_size[
(sizeof( intptr_t) == sizeof(void*)) ? 1 : -1];
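
The negative-array-size trick in the last four lines can be wrapped in a small helper. This is only a sketch: STATIC_ASSERT and the typedef names are hypothetical, and the conditions were chosen to be true on any conforming compiler so the snippet builds as shown.

```c
#include <stddef.h>

/* Declares an array typedef of size 1 when cond is true and of
 * size -1 when it is false. A negative array size is a compile
 * error, so a false condition stops the build at this line. */
#define STATIC_ASSERT(name, cond) typedef int name[(cond) ? 1 : -1]

/* Both conditions always hold, so both lines compile. */
STATIC_ASSERT(assert_char_is_one_byte, sizeof(char) == 1);
STATIC_ASSERT(assert_llong_not_smaller_than_long,
              sizeof(long long) >= sizeof(long));

/* Changing a condition to something false, for example
 * sizeof(char) == 2, makes the compiler reject the typedef. */
```

Compilers that support C11 offer _Static_assert(cond, "message") for the same purpose, with a readable error message instead of a complaint about the array size.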

Alberto.



On May 19, 5:07 pm, zhongsheng <zhongsh...@xxxxxxxxxxxxxxxxxxxxxxxxx>
wrote:
Thanks for the information. using ULONG_PTR seems to work for me.

It would be nice for Microsoft to document this pointer casting related sign
extension behavior so that others won't have to go through my pain.

Thanks,
zhongsheng



"Jakob Bohm" wrote:
Maxim S. Shatskih wrote:
So, the pointers are considered signed. Probably this is documented in C language spec or in MS's docs. I'm not surprised.
The C language leaves this implementation-defined, so it is at the
discretion of the compiler vendor and may depend on the compiler
version. The only integer types that can safely be cast to and from
pointers are those whose size is the same as the size of a pointer:
(U)LONGLONG/(U)LONG_PTR on Win64, (U)LONG/(U)LONG_PTR on Win32, (U)LONG
on Win16 large mode, (U)SHORT on Win16 small mode. Synonyms such as
DWORD_PTR and size_t are fine too.
Solution: be accurate and use (ULONG64)(ULONG_PTR)ptr. Fine across 32/64 bit builds, and will not have the issue with sign-extension on 32 bit, since on 32 bit this is (unsigned __int64)(unsigned long)ptr.
Yes.


--
Jakob Bøhm, M.Sc.Eng. * jb@xxxxxxxxxx * direct tel:+45-45-90-25-33
Netop Solutions A/S * Bregnerodvej 127 * DK-3460 Birkerod * DENMARK
http://www.netop.com * tel:+45-45-90-25-25 * fax:+45-45-90-25-26
Information in this mail is hasty, not binding and may not be right.
Information in this posting may not be the official position of Netop
Solutions A/S, only the personal opinions of the author.
