Ramblings on how to optimize Squeak for modern CPU bit
manipulation (Was Re: I was wondering ...
Lawson English
english at primenet.com
Fri Jun 25 20:25:54 UTC 1999
Adam Hill <ahill at users.arco.com> said:
>There is a large list of libraries from Intel that are free for the
>asking.
>We don't get source just binary redistribution.
>
>I realize this is not incredibly useful since it is X86 specific but it is
>an impressive list of libraries for some ideas(or API's):
[snipped library references]
I think that those might be worth looking at, but they are definitely
higher-level than what I was thinking of. I just want to emulate the
AltiVec API, which is sorta like MMX with 32 128-bit registers that can
also handle floating point operations.
I was thinking a little more about the AltiVec API issue, and I realized
that there are two needs here:
1) a set of quick and dirty primitives that can be called as standalones;
2) a full-blown (more or less) AltiVec simulator that uses a single set of
32 "registers" that are accessed by index, rather than by providing a
Smalltalk byteArray every time you use it.
That way, you only need to work with a simple index. Integer conversion
only needs to happen during/before "load" and "store" operations. You also
could cascade AltiVec primitive calls to perform complex DSP/image
manipulations without having to go through any Squeak data interface. One
could provide pointers to registers and "register files" to allow efficient
virtual register swapping/loading.
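To make the indexed-register idea concrete, here is a rough sketch in C of
what such a simulator core might look like. Everything in it is an
assumption for illustration: the names (VRegFile, vsim_use, vsim_load,
vsim_store, vsim_add_u8, vsim_and) are hypothetical and are not part of
AltiVec or of any Squeak primitive set. The byte-wise add wraps modulo 256,
in the spirit of AltiVec's vaddubm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VREG_COUNT 32   /* AltiVec has 32 vector registers */
#define VREG_BYTES 16   /* each register is 128 bits wide */

/* Hypothetical simulated register file: 32 x 128-bit registers. */
typedef struct {
    uint8_t r[VREG_COUNT][VREG_BYTES];
} VRegFile;

/* The register file currently in use. Repointing it swaps all 32
   simulated registers at once without copying any data. */
static VRegFile *vsim_current = NULL;

static void vsim_use(VRegFile *file) { vsim_current = file; }

/* Resolve a register index to its 16 bytes, checking the index. */
static uint8_t *vsim_reg(int reg) {
    assert(vsim_current != NULL);
    assert(reg >= 0 && reg < VREG_COUNT);
    return vsim_current->r[reg];
}

/* "Load" and "store" are the only places where data crosses between
   the caller's buffer and the simulated registers. */
static void vsim_load(int reg, const uint8_t *src) {
    memcpy(vsim_reg(reg), src, VREG_BYTES);
}

static void vsim_store(int reg, uint8_t *dst) {
    memcpy(dst, vsim_reg(reg), VREG_BYTES);
}

/* dst = a + b for each of the 16 bytes, wrapping modulo 256. */
static void vsim_add_u8(int dst, int a, int b) {
    uint8_t *d = vsim_reg(dst);
    const uint8_t *x = vsim_reg(a);
    const uint8_t *y = vsim_reg(b);
    for (int i = 0; i < VREG_BYTES; i++)
        d[i] = (uint8_t)(x[i] + y[i]);
}

/* dst = a AND b over all 128 bits. */
static void vsim_and(int dst, int a, int b) {
    uint8_t *d = vsim_reg(dst);
    const uint8_t *x = vsim_reg(a);
    const uint8_t *y = vsim_reg(b);
    for (int i = 0; i < VREG_BYTES; i++)
        d[i] = x[i] & y[i];
}
```

A caller would do two vsim_load calls, then chain something like
vsim_add_u8(2, 0, 1) and vsim_and(3, 2, 1) using nothing but small integer
indices, and finish with one vsim_store. Keeping the register file behind a
pointer is one possible way to get the cheap "register file" swapping
mentioned above.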
One could manipulate scan-lines 16 bytes at a time, controlled by
Smalltalk. Add in a JIT Squeak compiler, and I suspect that a virtual
AltiVec algorithm could approach the speed of a reasonably efficient
C-based algorithm designed to do the same thing. On AltiVec machines, the
loop would go that much faster.
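As a rough illustration of the 16-bytes-at-a-time idea, here is a small C
sketch that brightens one scan-line by stepping through it in 128-bit
blocks. It is only a sketch under assumptions: the names (Vec128,
sim_add_u8_sat, brighten_scanline) are hypothetical, the C loop stands in
for the loop that Smalltalk would drive, and the clamping add merely
imitates the behaviour of AltiVec's saturating byte add (vaddubs) in plain
C rather than using real vector instructions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 16   /* one simulated 128-bit vector */

typedef struct { uint8_t b[VEC_BYTES]; } Vec128;

/* Per-byte unsigned add that clamps at 255 instead of wrapping. */
static Vec128 sim_add_u8_sat(Vec128 a, Vec128 b) {
    Vec128 out;
    for (int i = 0; i < VEC_BYTES; i++) {
        unsigned sum = (unsigned)a.b[i] + (unsigned)b.b[i];
        out.b[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
    return out;
}

/* Brighten a scan-line in place, one 16-byte block per step. */
static void brighten_scanline(uint8_t *line, size_t len, uint8_t amount) {
    Vec128 delta, px;
    size_t i = 0;

    /* Splat the brightness amount into every byte of one vector. */
    memset(delta.b, amount, VEC_BYTES);

    for (; i + VEC_BYTES <= len; i += VEC_BYTES) {
        memcpy(px.b, line + i, VEC_BYTES);      /* "load"  */
        px = sim_add_u8_sat(px, delta);
        memcpy(line + i, px.b, VEC_BYTES);      /* "store" */
    }

    /* Leftover bytes when len is not a multiple of 16. */
    for (; i < len; i++) {
        unsigned sum = (unsigned)line[i] + (unsigned)amount;
        line[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
}
```

On a machine with real AltiVec hardware, the inner per-byte loop is the
part that would collapse into a single vector instruction.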
----------------------------------------------------------------------
Use your imagination.
----------------------------------------------------------------------