Ramblings on how to optimize Squeak for modern CPU bit manipulation (Was Re: I was wondering ...

Lawson English english at primenet.com
Tue Jun 29 18:38:54 UTC 1999


Further refinements to my pixel-handling primitives idea:

In addition to the 128-bit AltiVec simulation/primitives, it would be useful
to implement:

A set of quick and dirty AltiVec-like primitives that operate only on
single pixels at the various standard screen depths: 32, 16, and 8 bits.

A full-blown single-pixel equivalent of the AltiVec simulator that would
allow one to manipulate the color channels of one pixel at a time.

E.g., C-based methods to separate color channels into 16 or 32-bit values,
and manipulate them simultaneously or separately, as well as methods to
convert them back to 8/5/whatever bits per channel and repack them into a
single pixel.
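A minimal sketch of that unpack/manipulate/repack cycle for one 16-bit pixel might look like the following. It assumes a 5-bits-per-channel layout (one unused bit, then red, green, blue), and every name in it (Channels16, unpack555, scale_channels, pack555) is hypothetical rather than an existing Squeak primitive or AltiVec call:

```c
#include <stdint.h>

/* Hypothetical names throughout; nothing here is an existing Squeak or
   AltiVec API. Assumes a 16-bit pixel laid out as x1 R5 G5 B5. */
typedef struct {
    uint16_t r, g, b;   /* each 5-bit channel widened to 16 bits */
} Channels16;

/* Split one 16-bit pixel into three widened channels. */
static Channels16 unpack555(uint16_t pixel)
{
    Channels16 c;
    c.r = (uint16_t)((pixel >> 10) & 0x1F);
    c.g = (uint16_t)((pixel >> 5) & 0x1F);
    c.b = (uint16_t)(pixel & 0x1F);
    return c;
}

/* Scale one widened channel by num/den, saturating at 16 bits.
   den must be non-zero. */
static uint16_t scale_one(uint16_t v, uint16_t num, uint16_t den)
{
    uint32_t wide = ((uint32_t)v * num) / den;
    return wide > 0xFFFFu ? (uint16_t)0xFFFFu : (uint16_t)wide;
}

/* Manipulate all three channels together. The results may exceed
   5 bits; that is fine until the pixel is repacked. */
static Channels16 scale_channels(Channels16 c, uint16_t num, uint16_t den)
{
    c.r = scale_one(c.r, num, den);
    c.g = scale_one(c.g, num, den);
    c.b = scale_one(c.b, num, den);
    return c;
}

/* Clamp a widened channel back to the 5-bit range. */
static uint16_t clamp5(uint16_t v)
{
    return v > 0x1F ? (uint16_t)0x1F : v;
}

/* Convert back to 5 bits per channel and repack into one pixel. */
static uint16_t pack555(Channels16 c)
{
    return (uint16_t)((clamp5(c.r) << 10) | (clamp5(c.g) << 5) | clamp5(c.b));
}
```

Brightening a pixel by half would then be pack555(scale_channels(unpack555(p), 3, 2)). Clamping happens only at repack time, so several operations can be chained on the widened values without losing range in between.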

The idea is to create a bunch of primitives that can manipulate pixels
efficiently for color-handling, as well as to speed up DSP-like operations.
The simulator (32 or 128-bit) would store intermediate values in virtual
registers so that no conversion or other data-related overhead would be
incurred until you needed to manipulate the data using standard Smalltalk.
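To make the virtual-register idea concrete, here is a rough sketch of a register file that lives on the C side and is addressed by index, so data crosses over from Smalltalk only on load and store. The register count of 32 follows the earlier message quoted in this post; the function names, the return-code convention, and the choice of a saturating unsigned byte add as the sample operation are all assumptions made for illustration:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical simulator state: 32 registers of 128 bits each. */
#define NUM_VREGS  32
#define VREG_BYTES 16

typedef struct {
    uint8_t bytes[VREG_BYTES];
} VReg;

static VReg vregs[NUM_VREGS];

static int vreg_valid(int index)
{
    return index >= 0 && index < NUM_VREGS;
}

/* Copy 16 bytes from the caller into a register. Returns 0 on success,
   -1 on a bad index. */
static int vreg_load(int index, const uint8_t *src)
{
    if (!vreg_valid(index)) return -1;
    memcpy(vregs[index].bytes, src, VREG_BYTES);
    return 0;
}

/* Copy a register back out to the caller. */
static int vreg_store(int index, uint8_t *dst)
{
    if (!vreg_valid(index)) return -1;
    memcpy(dst, vregs[index].bytes, VREG_BYTES);
    return 0;
}

/* dst = a + b on sixteen unsigned bytes, saturating at 255.
   Operands and result stay inside the register file. */
static int vreg_add_sat_u8(int dst, int a, int b)
{
    int i;
    if (!vreg_valid(dst) || !vreg_valid(a) || !vreg_valid(b)) return -1;
    for (i = 0; i < VREG_BYTES; i++) {
        unsigned sum = (unsigned)vregs[a].bytes[i] + vregs[b].bytes[i];
        vregs[dst].bytes[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
    return 0;
}
```

A primitive call would then pass only small integers (register indexes) for every operation between the initial load and the final store, which is where the savings over handing in a fresh byte array each time would presumably come from.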

The 128-bit simulator/primitives would be suitable for long streams of
pixels or other data. The 32-bit simulator/primitives would be suitable for
short segments of data (or for handling edge conditions in a long stream on
a machine that has an AltiVec-like device handy).
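The split between the two widths could be sketched roughly as below: whole 128-bit blocks (four 32-bit pixels) go through the wide path, and whatever is left at the end of the stream falls back to the single-pixel path. The operation (inverting the low 24 bits of each pixel) and all function names are placeholders chosen for illustration, and the "wide" path here is an ordinary loop standing in for whatever vector unit or simulator would really do the work:

```c
#include <stddef.h>
#include <stdint.h>

#define PIXELS_PER_BLOCK 4          /* four 32-bit pixels = 128 bits */
#define COLOR_MASK 0x00FFFFFFu      /* assumed: top byte is not color */

/* Single-pixel path: invert the color bits of one 32-bit pixel. */
static void invert_pixel(uint32_t *p)
{
    *p ^= COLOR_MASK;
}

/* Wide path: stand-in for one 128-bit operation on four pixels. */
static void invert_block(uint32_t *p)
{
    int i;
    for (i = 0; i < PIXELS_PER_BLOCK; i++) {
        p[i] ^= COLOR_MASK;
    }
}

/* Process a stream: full blocks first, then the leftover edge pixels. */
static void invert_stream(uint32_t *pixels, size_t count)
{
    size_t i = 0;
    for (; i + PIXELS_PER_BLOCK <= count; i += PIXELS_PER_BLOCK) {
        invert_block(pixels + i);
    }
    for (; i < count; i++) {
        invert_pixel(pixels + i);
    }
}
```

With six pixels, for example, the first four take the block path and the last two take the single-pixel path.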

For MMX, perhaps an intermediate 64-bit version could be implemented as
well, or facilities created to handle 64-bit segments from within the
128-bit version?

Comments? Criticisms? I haven't benchmarked any of this, but my intuition
says that the time savings from doing this could be quite good, especially
during the prototyping phase of pixel/DSP algorithms.

I said:

[refinements to my idea include]

>1) a set of quick and dirty primitives that can be called as standalones; 
>2) a full-blown (more or less) AltiVec simulator that uses a single set of
>32 "registers" that are accessed by index, rather than by providing a
>Smalltalk ByteArray every time you use it.



----------------------------------------------------------------------
Use your imagination.
----------------------------------------------------------------------




