Byte order and ByteArray conversions

Andrew C. Greenberg werdna at gate.net
Mon Nov 1 01:46:36 UTC 1999


>Further background: It seems to me that there should be no real conceptual
>difference between an IO channel to a file, and an IO channel to a socket,
>or to any other external IO channel which can be opened, closed, read,
>or written to. Presumably these should all look like streams, and presumably
>we would want socket streams, file streams, and any other streams on
>external IO channels to behave in similar ways. The specifics of making
>connections to various types of external IO channels probably merit their
>own classes, which would know about things like SQFile and SQSocket data
>structures if needed. I wrote some classes to do this for files and sockets,
>and it all seems to work very nicely, but I still need a machine independent
>way to convert the file and socket data structures into Smalltalk objects,
>hence the question about converting integers back and forth between
>ByteArrays and Integers.

Sounds a bit like Inferno, where every external object, file, 
printer, socket, network, scanner, other machine, etc. is a "file," 
manipulated using the traditional file operations of open, close, 
read, write, position, and each such object is named and accessed 
using an i-node-like hierarchical naming system.

If you know that the data is coming from the machine you are running 
on, you might simply write some pluggable primitive functions, 
longAt:, shortAt: and the like, that do the obvious things by casting 
in the C code rather than trying to "interpret" the data.
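
A minimal sketch of what the C side of such primitives might look 
like. The names native_short_at and native_long_at are hypothetical, 
and none of the Squeak plugin glue is shown; the index is 1-based only 
to mirror ByteArray>>at:. memcpy stands in for the raw pointer cast 
because it gives the same native-order result without requiring the 
address to be aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers, not part of any real Squeak plugin API.
   index is 1-based, as in Smalltalk; size is the byte array length. */

/* Read 2 bytes at index in the host's native byte order. */
uint16_t native_short_at(const uint8_t *bytes, size_t size, size_t index)
{
    uint16_t value;
    assert(index >= 1 && index - 1 + sizeof value <= size);
    /* Same effect as *(uint16_t *)(bytes + index - 1), but safe
       when that address is not 2-byte aligned. */
    memcpy(&value, bytes + index - 1, sizeof value);
    return value;
}

/* Read 4 bytes at index in the host's native byte order. */
uint32_t native_long_at(const uint8_t *bytes, size_t size, size_t index)
{
    uint32_t value;
    assert(index >= 1 && index - 1 + sizeof value <= size);
    memcpy(&value, bytes + index - 1, sizeof value);
    return value;
}
```

No byte swapping happens anywhere in these functions, which is the 
whole point: the value returned depends on the machine the code was 
compiled for.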

This generates "portable" code that has the property that, when the 
plugin is compiled and installed on each native machine, it will load 
the data from the ByteArray, interpreting it with the "correct" byte 
gender for the machine on which it is running.  And, of course, once 
you have those operations, you can test your machine gender simply by 
loading in an asymmetric byte pattern, say, 16rEAFF, and checking the 
value that shortAt: returns.
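
The byte-order test could be sketched in C roughly as follows. The 
enum and the function name host_byte_order are made up for 
illustration; the only idea taken from the text above is reading the 
two bytes EA FF natively and looking at what comes back.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical names, chosen for this sketch only. */
typedef enum {
    HOST_ORDER_BIG,     /* first byte is the most significant  */
    HOST_ORDER_LITTLE,  /* first byte is the least significant */
    HOST_ORDER_UNKNOWN
} host_order;

/* Load the asymmetric pattern EA FF as a native 16-bit value
   and see which way round the machine read it. */
host_order host_byte_order(void)
{
    const uint8_t pattern[2] = { 0xEA, 0xFF };
    uint16_t value;

    memcpy(&value, pattern, sizeof value);  /* native-order read */
    if (value == 0xEAFF) return HOST_ORDER_BIG;
    if (value == 0xFFEA) return HOST_ORDER_LITTLE;
    return HOST_ORDER_UNKNOWN;
}
```

The pattern has to be asymmetric: something like 16rFFFF would read 
back the same on every machine and tell you nothing.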

The code is portable in the sense that it will compile on each 
machine; it will run with the machine's own native data correctly, 
but is inherently machine dependent, in that images retaining this 
information may break when copied from machine to machine.

Thus, the code should only be used for machine-dependent purposes or 
for interpreting highly transient machine local data that will 
ultimately be stored in non-byteArray (or machine-independent byteArray) 
data objects.  This is why, I believe, Squeak Central left only the 
"machine-independent" operations in ByteArray -- the code yields the 
same results on every machine everywhere, even though its internals 
might operate differently.
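
For contrast, here is a sketch of an accessor that gives the same 
answer on every machine, because the byte order is fixed by the code 
rather than by the hardware. The function names are hypothetical and 
big-endian order is an arbitrary choice for the example; this is not 
meant to reproduce the actual ByteArray methods.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers; index is 1-based, as in Smalltalk. */

/* Read 4 bytes at index, most significant byte first,
   whatever the byte order of the host machine. */
uint32_t long_at_big_endian(const uint8_t *bytes, size_t index)
{
    const uint8_t *p = bytes + index - 1;
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Store value at index, most significant byte first. */
void long_at_put_big_endian(uint8_t *bytes, size_t index, uint32_t value)
{
    uint8_t *p = bytes + index - 1;
    p[0] = (uint8_t)(value >> 24);
    p[1] = (uint8_t)(value >> 16);
    p[2] = (uint8_t)(value >> 8);
    p[3] = (uint8_t)value;
}
```

Shifts and masks work on values rather than on memory layout, so a 
ByteArray written this way can travel with the image from machine to 
machine.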

I have a project that I have been working on, off and on, for a while 
trying to model native machine MemoryBlocks, to facilitate 
machine-dependent manipulation of data, and the coercion/marshalling 
of data from Squeak to OS and back.  The project entailed a hierarchy 
of classes that would seamlessly understand a block that was stored 
in a byteArray, stored in system RAM, and pointed to (as pointer, 
handle or otherwise) from a MemoryBlock, whether internal or 
external.  Using the decorator pattern, the project allows you to 
define system C records and other datatypes, and refer to them with a 
nice smalltalk syntax, regardless of whether the C type is stored in 
a byteArray, in RAM or otherwise.  For various reasons, the 
machine-dependent byteAt:, shortAt:, etc. were useful.

Lately, I've been wondering how good such a thing is in principle -- 
one of the reasons that Java never made the cut of write-once, 
run-anywhere had to do with a yicky "easy" solution of using machine 
dependencies to solve problems.




