I bumped into an interesting article [1] by Rob Pike that... i = *((int*)data); #ifdef BIG_ENDIAN /* swap the bytes */ i = ((i&0xFF)<<24) | (((i>>8)&0xFF)<<16) | (((i>>16)&0xFF)<<8) | (((i>>24)&0xFF)<<0); #endif
is the wrong way to handle endianness since... "It assumes integers are addressable at any byte offset; on some machines that's not true." "It depends on integers being 32 bits long, or requires more #ifdefs to pick a 32-bit integer type."
I guess some people here have byte-ordering war wounds so I'd value your opinion on whether I should file that as good advice. I notice we have a couple of instances of this...
$ find platforms -name "*.c" -exec grep -Hi "<<" {} ; | grep ">>" | grep 24 | grep 8 ./win32/plugins/FontPlugin/sqWin32FontPlugin.c: #define BYTE_SWAP(w) ((w << 24) | ((w & 0xFF00) << 8) | ((w >> 8) & 0xFF00) | (w >> 24)) ./win32/vm/sqWin32Window.c: #define BYTE_SWAP(w) w = (w<<24) | ((w&0xFF00)<<8) | ((w>>8)&0xFF00) | (w>>24)
cheers -ben
[1] https://commandcenter.blogspot.com.au/2012/04/byte-order-fallacy.html
Hi Ben, last year I added SQ_SWAP_4_BYTES and SQ_SWAP_8_BYTES macros in platforms/Cross/vm/sqMemoryAccess.h These macros use the compiler intrinsics if possible. https://github.com/OpenSmalltalk/opensmalltalk-vm/commit/c0da5abcc7a538efd3a... So we should not define any new one and rather replace the suboptimal ones with the ones above.
i = *((int*)data); is doubly bad because of possible alignment problem as you told,but also because it introduces pointer aliasing, what the C compiler does not like, and the standards neither (C89 C99 ...).
However, our int are 32 bits long in all supported platforms. Would it not be the case, the VM would break in many places. For example, you can see the cost in term of number of commits for the fact that sizeof long < sizeof void * in LLP64 both in VMMaker and Opensmalltalk-vm source.
2017-06-13 6:13 GMT+02:00 Ben Coman btc@openinworld.com:
I bumped into an interesting article [1] by Rob Pike that... i = *((int*)data); #ifdef BIG_ENDIAN /* swap the bytes */ i = ((i&0xFF)<<24) | (((i>>8)&0xFF)<<16) | (((i>>16)&0xFF)<<8) | (((i>>24)&0xFF)<<0); #endif
is the wrong way to handle endianness since... "It assumes integers are addressable at any byte offset; on some machines that's not true." "It depends on integers being 32 bits long, or requires more #ifdefs to pick a 32-bit integer type."
I guess some people here have byte-ordering war wounds so I'd value your opinion on whether I should file that as good advice. I notice we have a couple of instances of this...
$ find platforms -name "*.c" -exec grep -Hi "<<" {} ; | grep ">>" | grep 24 | grep 8 ./win32/plugins/FontPlugin/sqWin32FontPlugin.c: #define BYTE_SWAP(w) ((w << 24) | ((w & 0xFF00) << 8) | ((w >> 8) & 0xFF00) | (w >> 24)) ./win32/vm/sqWin32Window.c: #define BYTE_SWAP(w) w = (w<<24) | ((w&0xFF00)<<8) | ((w>>8)&0xFF00) | (w>>24)
cheers -ben
[1] https://commandcenter.blogspot.com.au/2012/04/byte-order-fallacy.html
cool. thx Nicolas. cheers -ben
On Tue, Jun 13, 2017 at 3:38 PM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:
Hi Ben, last year I added SQ_SWAP_4_BYTES and SQ_SWAP_8_BYTES macros in platforms/Cross/vm/sqMemoryAccess.h These macros use the compiler intrinsics if possible. https://github.com/OpenSmalltalk/opensmalltalk-vm/commit/ c0da5abcc7a538efd3ab6dcc4bb2d2ef3ac5c65a So we should not define any new one and rather replace the suboptimal ones with the ones above.
i = *((int*)data); is doubly bad because of possible alignment problem as you told,but also because it introduces pointer aliasing, what the C compiler does not like, and the standards neither (C89 C99 ...).
However, our int are 32 bits long in all supported platforms. Would it not be the case, the VM would break in many places. For example, you can see the cost in term of number of commits for the fact that sizeof long < sizeof void * in LLP64 both in VMMaker and Opensmalltalk-vm source.
2017-06-13 6:13 GMT+02:00 Ben Coman btc@openinworld.com:
I bumped into an interesting article [1] by Rob Pike that... i = *((int*)data); #ifdef BIG_ENDIAN /* swap the bytes */ i = ((i&0xFF)<<24) | (((i>>8)&0xFF)<<16) | (((i>>16)&0xFF)<<8) | (((i>>24)&0xFF)<<0); #endif
is the wrong way to handle endianness since... "It assumes integers are addressable at any byte offset; on some machines that's not true." "It depends on integers being 32 bits long, or requires more #ifdefs to pick a 32-bit integer type."
I guess some people here have byte-ordering war wounds so I'd value your opinion on whether I should file that as good advice. I notice we have a couple of instances of this...
$ find platforms -name "*.c" -exec grep -Hi "<<" {} ; | grep ">>" | grep 24 | grep 8 ./win32/plugins/FontPlugin/sqWin32FontPlugin.c: #define BYTE_SWAP(w) ((w << 24) | ((w & 0xFF00) << 8) | ((w >> 8) & 0xFF00) | (w >> 24)) ./win32/vm/sqWin32Window.c: #define BYTE_SWAP(w) w = (w<<24) | ((w&0xFF00)<<8) | ((w>>8)&0xFF00) | (w>>24)
cheers -ben
[1] https://commandcenter.blogspot.com.au/2012/04/byte-order-fallacy.html
On Tuesday 13 June 2017 01:08 PM, Nicolas Cellier wrote:
last year I added SQ_SWAP_4_BYTES and SQ_SWAP_8_BYTES macros in platforms/Cross/vm/sqMemoryAccess.h These macros use the compiler intrinsics if possible. https://github.com/OpenSmalltalk/opensmalltalk-vm/commit/c0da5abcc7a538efd3a... So we should not define any new one and rather replace the suboptimal ones with the ones above.
Why not use htonl family of functions? They are part of POSIX standards and widely available in <netinet/in.h>.
See the man pages for htonl and endian.
Regards .. Subbu
On 13.06.2017, at 20:28, K K Subbu kksubbu.ml@gmail.com wrote:
On Tuesday 13 June 2017 01:08 PM, Nicolas Cellier wrote:
last year I added SQ_SWAP_4_BYTES and SQ_SWAP_8_BYTES macros in platforms/Cross/vm/sqMemoryAccess.h These macros use the compiler intrinsics if possible. https://github.com/OpenSmalltalk/opensmalltalk-vm/commit/c0da5abcc7a538efd3a... So we should not define any new one and rather replace the suboptimal ones with the ones above.
Why not use htonl family of functions? They are part of POSIX standards and widely available in <netinet/in.h>.
See the man pages for htonl and endian.
because "On machines which have a byte order which is the same as the network order, [htonl is] defined as null macro[]." which defies the generality of the swapping functions. Also, per posix there's only 16 and 32 bit versions of those, but we need 32 and 64 bit versions.
Best regards -Tobias
Regards .. Subbu
vm-dev@lists.squeakfoundation.org