On Sun, Apr 19, 2009 at 07:57:20AM -0700, Eliot Miranda wrote:
On Sun, Apr 19, 2009 at 6:43 AM, Bryce Kampjes bryce@kampjes.demon.co.uk wrote:
On Sat, 2009-04-18 at 18:15 -0700, Eliot Miranda wrote:
Hi All,
I see that Float 32-bit word order is big-endian (PowerPC) on all platforms. This is a pain for performance and a pain for code generation in Cog. For example, with SSE2 it is trivial to swizzle a PowerPC-layout Float into an xmm register on read using the PSHUFD instruction, but tediously verbose to swizzle on write: the shuffle has to be done in an xmm register, which destroys the value held there, so it takes three instructions (shuffle, write, unshuffle) just to write a Float result. Yes, ok, 2 extra instructions is small potatoes, but they're still starch. So I wonder what the impact would be of maintaining Floats in platform order? There are a number of possible solutions.
- Floats are always in platform order and swizzled on image load when
moving from little-endian to big-endian or vice versa. Image code must be rewritten to take the platform's endianness into account. (requires an image rewrite)
- As for the first option, but the image is isolated from the change by providing
two primitives, primitiveFloatAt and primitiveFloatAtPut which are implemented with selectors at: basicAt: at:put: and basicAt:put: on Float. These primitives map index 1 onto the most significant word and index 2 onto the least significant word. (requires no image rewrite, but does require a file-in of the four implementations)
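To make the two options above concrete, here is a rough Python sketch, not VM code. It assumes a Float body is 8 bytes, that the current layout keeps the most significant 32-bit word first with each word's bytes in platform order, and that "platform order" means the native IEEE double layout. The names `swap_words`, `float_at` and `float_at_put` are made up for illustration; they only mimic what primitiveFloatAt and primitiveFloatAtPut are described as doing.

```python
import struct
import sys

def swap_words(stored):
    """Exchange the two 32-bit halves of an 8-byte Float body.

    On a little-endian platform this converts between the current
    (most significant word first) layout and platform order.
    """
    return stored[4:] + stored[:4]

def _slot(index, byteorder):
    # Index 1 is the most significant word, index 2 the least significant.
    if index not in (1, 2):
        raise IndexError("Float index must be 1 or 2")
    # In platform order the high word is first on big-endian machines
    # and second on little-endian machines.
    return index - 1 if byteorder == "big" else 2 - index

def float_at(stored, index, byteorder=sys.byteorder):
    """Read a 32-bit word from a Float held in platform order."""
    fmt = ">II" if byteorder == "big" else "<II"
    return struct.unpack(fmt, stored)[_slot(index, byteorder)]

def float_at_put(stored, index, word, byteorder=sys.byteorder):
    """Return a copy of the Float body with one 32-bit word replaced."""
    fmt = ">II" if byteorder == "big" else "<II"
    words = list(struct.unpack(fmt, stored))
    words[_slot(index, byteorder)] = word & 0xFFFFFFFF
    return struct.pack(fmt, *words)
```

With this mapping, image code that asks for index 1 gets the sign/exponent word on either kind of platform, which is the point of hiding the storage order behind the two primitives.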
I'd like to see Floats stored in native format too. Don't forget about the 32 bit floats in Float arrays.
Tell me more :) Are these in some funky order, or are they just IEEE single precision in platform order?
The attached world.png is a screen shot of a 64-bit image running on an Intel box, with hex printouts of the contents of an IntegerArray and a FloatArray (note, OopPlugin is a utility that I use for accessing the internals of object memory slots in the real object memory). This shows the internal storage of float values in a FloatArray. I poked various values into the array so you can see where they are stored in the 64-bit object memory words.
The values in a FloatArray are 32-bit floats, packed into 64-bit slots in the object memory. There are no endian issues to worry about. On both 32-bit and 64-bit object memories, the values are arranged in the order of an (int *) access. In other words, they are arrays of 32-bit values that just happen to be stuffed onto slots that the object memory thinks are 64-bit words.
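As a small illustration of that layout (a Python sketch under assumptions, not the VM's code): the helper names below are hypothetical, indices are 0-based, and the array body is padded with zeros to a whole number of 64-bit slots. Element i always sits at byte offset 4*i, exactly as an (int *) access would find it; only the view as 64-bit words differs between byte orders.

```python
import struct

def pack_float_array(values, byteorder="little"):
    """Pack 32-bit floats back to back, padded to a whole 64-bit slot."""
    prefix = "<" if byteorder == "little" else ">"
    data = struct.pack(f"{prefix}{len(values)}f", *values)
    if len(data) % 8:
        data += b"\x00" * 4  # an odd element count leaves half a slot unused
    return data

def float32_at(data, index, byteorder="little"):
    """Read element `index` (0-based) as a 32-bit access would."""
    prefix = "<" if byteorder == "little" else ">"
    return struct.unpack_from(prefix + "f", data, 4 * index)[0]

def slots_as_words(data, byteorder="little"):
    """View the same bytes as the 64-bit words the object memory sees."""
    prefix = "<" if byteorder == "little" else ">"
    return list(struct.unpack(f"{prefix}{len(data) // 8}Q", data))
```

For example, packing 1.0, 2.0 and 3.0 on a little-endian layout puts 1.0 in the low half of the first 64-bit word and 2.0 in the high half, while a big-endian layout puts them the other way round, yet `float32_at` returns the same values in both cases.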
Of course, storage of 32-bit floats in FloatArray is unrelated to the original topic of Float swizzling.
(BTW, is reverseWordsFrom:to: broken for 64-bit images?)
As far as I know, there are no problems with this. The original 64-bit image was done on a big-endian box, and descendants of that image are running on my little-endian box today, so #reverseWordsFrom:to: must have worked.
Dave