[Hardware] Compiler optimizations on OO hardware
Klaus D. Witzel
klaus.witzel at cobss.com
Mon Jul 2 17:51:08 UTC 2007
On Mon, 02 Jul 2007 00:07:33 +0200, Matthew Fulmer wrote:
> My professor showed me this paper on compiler optimizations that
> can be applied to make asynchronous message passing fast on
> current multicore processors:
> I wonder exactly what hardware features could make this even
I have known memory tags since the B5x00 and have used them.
In my experience, memory tags make sense when the hardware knows about them
(of course, software must be able to write them, and occasionally read them).
As for other hardware features that could make message passing faster: I think
that parallel, disjoint instruction streams per processor could be useful
(as an alternative to multicore processors).
Optimizing compilers attempt to generate code sequences which, under "all
circumstances", do not stall the processor. Instead, another instruction
stream (which need not be related to the "main" instruction
stream) could fill the gap (perhaps a stream with lower priority). This
would free architects and compilers from having to take into account every
detail about a processor's "optimum" instruction sequences and at the same
time reward every pair of concurrently running programs.
An example: while sending a message, another message could be received (or
be prepared) simultaneously. Compare that to two processors of a multicore,
which have no means of avoiding stalls (other than asking the architects
and compilers for better optimized code and cores).
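To make the gap-filling idea concrete, here is a toy cycle count in Python.
It is only a sketch under assumptions that are not in the discussion above:
one issue slot per cycle, each instruction reduced to the number of cycles
its stream stalls afterwards, and made-up stall values for a "send" and a
"receive" stream. The name run is hypothetical.

```python
def run(streams):
    """Return the cycle at which all streams have finished.

    Each stream is a list of stall counts: after issuing an instruction,
    that stream cannot issue again for that many cycles. Streams are given
    in priority order; each cycle, the first ready stream issues.
    """
    pc = [0] * len(streams)        # next instruction index per stream
    ready_at = [0] * len(streams)  # first cycle each stream may issue again
    cycle = 0
    while any(pc[i] < len(s) for i, s in enumerate(streams)):
        for i, s in enumerate(streams):
            if pc[i] < len(s) and ready_at[i] <= cycle:
                ready_at[i] = cycle + 1 + s[pc[i]]  # issue, then stall
                pc[i] += 1
                break  # only one instruction issues per cycle
        cycle += 1  # if no stream was ready, this cycle is a wasted gap
    return max(ready_at, default=0)


# Illustrative stall counts only (assumed, not measured).
send = [2, 0, 2]   # "main" stream, higher priority
recv = [1, 1]      # lower-priority stream that fills the gaps

one_after_the_other = run([send]) + run([recv])
interleaved = run([send, recv])
print(one_after_the_other, interleaved)
```

In this model the second stream issues only in cycles where the first one
is stalled or done, so the main stream is never delayed by it.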
> My first guess is tagging; by tagging I mean that some objects
> could be stored un-boxed, and so preventing an additional memory
> lookup to determine the class of the object. Tagging seems to
> mean different things to different people; what I mean is that
> every word would have a few extra bits that the CPU would
> generally ignore, and so were free to be used by the programmer
> any way they want. One use would be to use the extra bits to
> store some type information; for example, one could use one code
> to represent pointers, another for integers, another for future
> objects, etc. Some classes could be fully represented with just
> a one-word pointer, like Smalltalk does with SmallIntegers.
> What other ways could this be better?
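The tagging scheme described in the quoted text can be sketched in Python
as follows. This is an illustration under assumptions of my own: a 32-bit
payload with the tag stored in extra bits above it, three tag codes, and a
dictionary standing in for the heap. None of the names or widths come from
the discussion.

```python
PAYLOAD_BITS = 32                       # assumed word size
PAYLOAD_MASK = (1 << PAYLOAD_BITS) - 1

# Assumed tag codes, one per kind of word.
TAG_POINTER, TAG_INTEGER, TAG_FUTURE = 0, 1, 2


def make_word(tag, payload):
    """Pack a tag into the extra bits above a 32-bit payload."""
    return (tag << PAYLOAD_BITS) | (payload & PAYLOAD_MASK)


def tag_of(word):
    return word >> PAYLOAD_BITS


def payload_of(word):
    return word & PAYLOAD_MASK


def int_value(word):
    """Read the payload as a signed two's-complement integer."""
    v = payload_of(word)
    return v - (1 << PAYLOAD_BITS) if v >> (PAYLOAD_BITS - 1) else v


def class_of(word, heap):
    """Find the class of a word; only pointers need a memory lookup."""
    tag = tag_of(word)
    if tag == TAG_INTEGER:
        return "SmallInteger"           # decided from the tag alone
    if tag == TAG_FUTURE:
        return "Future"
    return heap[payload_of(word)]["class"]  # boxed object: extra lookup


heap = {16: {"class": "String"}}        # stand-in for object memory
n = make_word(TAG_INTEGER, -5)
p = make_word(TAG_POINTER, 16)
print(class_of(n, heap), int_value(n), class_of(p, heap))
```

The point of the sketch is the class_of function: for integers and futures
the answer comes from the tag bits, and only the pointer case touches the
heap.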