Why so few garage processors?

Alan Grimes alangrimes at starpower.net
Wed Mar 19 22:58:54 UTC 2003


om

/me was out looking for free schematics/PCB CAD software and couldn't
find any that was worth downloading (i.e., could be expected to compile
and produce useful results) -- for Linux at least.


As for Squeak hardware, I am very much in favor of a new Personal
Computer platform. If you think you have a potential business plan, I'll
help you draft it and definitely sign on. 


While multithreaded/multicore hardware is cool, the big issue is that
Squeak can't deal with it. I have a very large machine reserved for
Squeak that is, most lamentably, collecting dust at the moment. =\

I wrote to this list earlier about the issues of getting Leenooks Linear
Framebuffer support working; in this posting I will talk more about the
issues raised by supporting the dual-CPU SMP machine...

om

The current paradigm for using SMP hardware is to write multithreaded
applications. In this case we are interested in refactoring the VM so
that it can launch multiple parallel execution engines. While I lack the
skills to h4x0r the VM at this time, I have identified, with the
assistance of Ian the Great, the following changes to the standard
Squeak 3.4 VM that are required for this project and a few ideas about
how they can be implemented...

1. ENHANCED SQUEAK >> C COMPILER:
	For the benefit of G3/G4 users, the raw output of the Squeak >> C
compiler has been enhanced by encapsulating all local variables
in a struct FOO so that the compiler may use base-offset addressing,
which reportedly provides significant speedups on that hardware.
	Since the interpreter object will need to be instantiated for each
proc, the behavior of encapsulating all instance variables in structs
will need to be standardized.
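
As a rough illustration of the idea (not the actual generated VM source), the
sketch below gathers interpreter state into one struct, so that every field is
reached through a single base pointer and a second instance can be created
for another engine. The names InterpreterState, interp_init, interp_push and
interp_pop, and all field names, are hypothetical and chosen only for this
example.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_SLOTS 64  /* arbitrary size for the example */

/* Hypothetical stand-in for the interpreter's variables; the field
   names are illustrative, not those of the real Squeak VM. */
typedef struct InterpreterState {
    long stack[STACK_SLOTS];  /* tiny evaluation stack */
    size_t stackPointer;      /* index of the next free slot */
    long instructionPointer;  /* position in the current method */
} InterpreterState;

/* All accesses go through one base pointer, which lets the C compiler
   use base+offset addressing instead of one global address per variable. */
void interp_init(InterpreterState *s) {
    assert(s != NULL);
    s->stackPointer = 0;
    s->instructionPointer = 0;
}

/* Returns 1 on success, 0 if the stack is full. */
int interp_push(InterpreterState *s, long value) {
    if (s->stackPointer >= STACK_SLOTS) return 0;
    s->stack[s->stackPointer++] = value;
    return 1;
}

/* Returns 1 on success, 0 if the stack is empty. */
int interp_pop(InterpreterState *s, long *out) {
    if (s->stackPointer == 0) return 0;
    *out = s->stack[--s->stackPointer];
    return 1;
}
```

Because nothing here is a global, declaring two InterpreterState values gives
two independent interpreters, which is the property a per-proc instantiation
would rely on.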

2. REFACTORED INTERPRETER:
	The current interpreter attempts to be a "do-all" VM. It contains many
instance variables that, in a multithreaded system, should be global to
all "execution engines".
	I propose that the interpreter be refactored into a VM system and an
"EE" system, in which each VM may have many EEs.
	Furthermore, in the interpreter, there are many variables that are used
as indexing constants. It would seem that a better way to deal with
these is to write them as separate objects and use the standard
collection classes... To keep them compatible, the behavior of the
Squeak >> C compiler will need to be standardized so that the binary
footprint, while hidden, remains compatible.
	I would be happy to work on these personally, but first I will require a
tutorial on how to use the features of the refactoring browser...
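
One way to picture the proposed VM/EE split, purely as a sketch: a single VM
struct holds what all engines share, and each EE struct holds what one
running interpreter loop owns privately, plus a pointer back to its VM. The
struct layouts, field names, MAX_ENGINES limit and the functions vm_init and
vm_add_engine are assumptions made up for this example; the post does not
say which variables would land on which side.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ENGINES 4  /* arbitrary limit for the example */

struct VM;  /* forward declaration so EE can point back to it */

/* Per-engine state: private to one execution engine. */
typedef struct EE {
    struct VM *vm;            /* the shared VM this engine belongs to */
    int id;                   /* engine number within its VM */
    long instructionPointer;  /* private to this engine */
    long stackPointer;        /* private to this engine */
} EE;

/* State shared by every engine (hypothetical fields). */
typedef struct VM {
    size_t memorySize;        /* size of the shared object memory */
    int engineCount;          /* engines created so far */
    EE engines[MAX_ENGINES];
} VM;

void vm_init(VM *vm, size_t memorySize) {
    assert(vm != NULL);
    vm->memorySize = memorySize;
    vm->engineCount = 0;
}

/* Creates a new engine inside the VM; returns NULL when the VM is full. */
EE *vm_add_engine(VM *vm) {
    if (vm->engineCount >= MAX_ENGINES) return NULL;
    EE *ee = &vm->engines[vm->engineCount];
    ee->vm = vm;
    ee->id = vm->engineCount++;
    ee->instructionPointer = 0;
    ee->stackPointer = 0;
    return ee;
}
```

Each engine can then be handed to its own thread, while anything reached
through ee->vm is, by construction, the shared part that needs the
thread-safety work described in the next item.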

-------[We now have a first approximation of a multithreaded VM]-----

3. VM LEVEL RACE CONDITIONS
	At this point, we need to address any race conditions that may be in
the VM. Anything that may be called by two VM threads at once must be
written in a thread-safe manner. This includes the VM object and any of
the plugins... To make this work, a combination of changes to objects
and the Squeak >> C and perhaps even the Squeak >> bytecode compilers
might need to be implemented to generate thread-safe code.
	Race conditions are a bitch to squeeze out. A person working on this
would need a truly staggering understanding of the Squeak system as well
as parallel programming and synchronization.
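
To make "thread-safe manner" concrete, here is a minimal sketch of one shared
routine guarded by a POSIX mutex. It assumes pthreads are available and uses
a made-up bump allocator (SharedHeap, heap_init, heap_allocate) rather than
the real Squeak object memory; the point is only that the check and the
update of shared state happen under the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shared allocator state; not the real object memory. */
typedef struct SharedHeap {
    pthread_mutex_t lock;  /* guards nextFree */
    size_t nextFree;       /* offset of the first unused byte */
    size_t limit;          /* total bytes available */
} SharedHeap;

void heap_init(SharedHeap *h, size_t limit) {
    assert(h != NULL);
    pthread_mutex_init(&h->lock, NULL);
    h->nextFree = 0;
    h->limit = limit;
}

/* Reserves `bytes` bytes and returns their offset, or (size_t)-1 if the
   heap is full. Without the lock, two threads could both pass the size
   check and be handed the same offset. */
size_t heap_allocate(SharedHeap *h, size_t bytes) {
    size_t offset = (size_t)-1;
    pthread_mutex_lock(&h->lock);
    if (bytes <= h->limit - h->nextFree) {
        offset = h->nextFree;
        h->nextFree += bytes;
    }
    pthread_mutex_unlock(&h->lock);
    return offset;
}
```

A single coarse lock like this is the simplest option and also the one most
likely to serialize the engines; finer-grained schemes are possible but are
where the hard-to-find races tend to live.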

-------- [ Changes to the VM ] -----------------------


4. MANY:MANY SCHEDULING.
	Squeak's internal scheduler, and perhaps the process browser as well,
will need to be adjusted to implement a MANY:MANY scheduler which will
guarantee that a given process is assigned to no more than one processor
(thread) and that blocking is properly implemented, as priority alone is
no longer sufficient to guarantee exclusive execution.
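
The "no more than one processor" guarantee can be sketched with an atomic
ownership field: an engine may run a process only if it wins a
compare-and-swap on that field. This assumes a C11 compiler with
<stdatomic.h>; SqProcess, NO_OWNER, process_init, process_claim and
process_release are hypothetical names for this example and do not
correspond to Squeak's actual Process or ProcessorScheduler classes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_OWNER (-1)  /* marks a process that no engine is running */

/* Hypothetical process record for the example. */
typedef struct SqProcess {
    int priority;
    atomic_int owner;  /* id of the engine running it, or NO_OWNER */
} SqProcess;

void process_init(SqProcess *p, int priority) {
    assert(p != NULL);
    p->priority = priority;
    atomic_init(&p->owner, NO_OWNER);
}

/* Tries to assign the process to `engine`. Only one caller can succeed
   until the process is released, even if several engines try at once. */
bool process_claim(SqProcess *p, int engine) {
    int expected = NO_OWNER;
    return atomic_compare_exchange_strong(&p->owner, &expected, engine);
}

/* Gives the process back; has no effect unless `engine` is the owner. */
void process_release(SqProcess *p, int engine) {
    int expected = engine;
    atomic_compare_exchange_strong(&p->owner, &expected, NO_OWNER);
}
```

A scheduler built this way would still pick candidates by priority, but would
skip any process whose claim fails, since another engine is already running
it.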

5. Race conditions within the VM will need to be addressed. [ Ian tells
me there are some...] 

6. BENCHMARKING AND PERFORMANCE MONITORING...
	Changes to process accounting systems may be necessary.
	"n tinyBenchmarks" should be enhanced with multithreading so that N is
the number of threads. For example (on a 2-processor system):

$ 0 tinyBenchmarks.
0 threads produced: 0 bytecodes/sec,  0 sends/sec.

$ 1 tinyBenchmarks. 
1 threads produced: 1,000,000 bytecodes/sec, 100,000 sends/sec

$ 2 tinyBenchmarks.
2 threads produced: 1,800,000 bytecodes/sec, 180,000 sends/sec

$ 3 tinyBenchmarks.
3 threads produced: 1,750,000 bytecodes/sec, 175,000 sends/sec
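
To read such output, it helps to turn the rates into speedup and per-processor
efficiency. The helper functions below are a small sketch with made-up names
(speedup, efficiency); the figures in the usage note are the illustrative
numbers from the example above, not measurements.

```c
#include <assert.h>

/* Speedup of an N-thread run relative to the 1-thread run. */
double speedup(double rateN, double rate1) {
    assert(rate1 > 0.0);
    return rateN / rate1;
}

/* Fraction of the ideal linear speedup achieved on `processors` CPUs. */
double efficiency(double rateN, double rate1, int processors) {
    assert(processors > 0);
    return speedup(rateN, rate1) / processors;
}
```

With the example figures, speedup(1800000, 1000000) is 1.8 and
efficiency(1800000, 1000000, 2) is 0.9, while the 3-thread run gives a speedup
of 1.75: slightly worse than 2 threads, as one would expect when there are
more threads than processors.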

-----------------------------------------------------

om 

I hope this posting may be of some use to people in improving the
performance of squeak on SMP hardware.

-- 
Karl Marx is smiling.
http://users.rcn.com/alangrimes/


