Multi-core CPUs

Robert Withers reefedjib at yahoo.com
Thu Oct 18 16:06:20 UTC 2007


On Oct 17, 2007, at 11:12 PM, Hans-Martin Mosner wrote:

> This would only make things more complicated since then the primitives
> would have to start parallel native threads working on the same object
> memory.
> The problem with native threads is that the current object memory is
> not designed to work with multiple independent mutator threads. There
> are GC algorithms which work with parallel threads, but AFAIK they all
> have quite some overhead relative to the single-thread situation.
>
> IMO, a combination of native threads and green threads would be the
> best (although it still has the problem of parallel GC):
> The VM runs a small fixed number of native threads (default: number of
> available cores, but could be a little more to efficiently handle
> blocking calls to external functions) which compete for the runnable
> Smalltalk processes. That way, a number of processes could be active
> at any one time instead of just one. The synchronization overhead in
> the process-switching primitives should be negligible compared to the
> overhead needed for GC synchronization.

This is exactly what I have started work on.  I want to use the
foundations of SqueakElib as a message-passing mechanism between
objects assigned to different native threads.  There would be one
native thread per core.  I am currently trying to understand what to
do with all of the global variables used in the interpreter loop, so
that multiple threads can run that code.  I have given very little
thought to what would need to be protected in the object memory or in
the primitives.  I take this very much as a learning project.  Just
think, I'll be able to see how the interpreter works, the object
memory, bytecode dispatch, primitives... all of it, in fact.  If I can
come out with a working system that does message passing, even at the
cost of a poorly performing object memory, et al., then it will be a
major success for me.
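
To make the globals question concrete, here is a minimal sketch of one
possible direction: gather the interpreter-wide variables into a state
record and give each native thread its own copy.  This is Python, not
the VM's own code, and everything in it is an assumption for
illustration: InterpreterState, its fields, and the toy two-opcode
bytecode set are made up and do not correspond to the real interpreter
variables.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class InterpreterState:
    """Per-thread record replacing interpreter-wide globals (hypothetical fields)."""
    instruction_pointer: int = 0
    stack: list = field(default_factory=list)
    bytecodes_executed: int = 0


def interpret(state, bytecodes):
    """Toy dispatch loop: all mutable state is reached through `state`, never a global."""
    while state.instruction_pointer < len(bytecodes):
        op, arg = bytecodes[state.instruction_pointer]
        state.instruction_pointer += 1
        if op == "push":
            state.stack.append(arg)
        elif op == "add":
            right = state.stack.pop()
            left = state.stack.pop()
            state.stack.append(left + right)
        else:
            raise ValueError(f"unknown bytecode: {op}")
        state.bytecodes_executed += 1
    return state.stack[-1]


def run_in_threads(programs):
    """Run each program on its own native thread, each with a private InterpreterState."""
    results = [None] * len(programs)

    def worker(index, program):
        results[index] = interpret(InterpreterState(), program)

    threads = [
        threading.Thread(target=worker, args=(i, p))
        for i, p in enumerate(programs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

CPython will not actually run these threads in parallel, so this only
shows the shape of the state, not a speedup.  It also says nothing
about the shared object memory, which is the hard part.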

It is going to be slower, anyway, because I have to intercept each
message send as a possible non-local send.  To this end, the
compiler's Macro Transforms had to be disabled, so that the sends they
would otherwise inline can be intercepted too.  The system slowed
considerably.  I hope to speed the sends up again using runtime
information: is the receiver owned by the thread that is currently
running?
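
The runtime check could look roughly like the sketch below.  It is
Python and purely illustrative: Vat (borrowing E's term for a thread
that owns a set of objects), adopt, send and the _owner attribute are
names I made up for this example, not SqueakElib's API.  A send to a
receiver owned by the running thread is performed directly; any other
send is queued on the owner's mailbox and answers a future.

```python
import queue
import threading
from concurrent.futures import Future


def _perform(receiver, selector, args, future):
    """Invoke the method and record its result or exception in the future."""
    try:
        future.set_result(getattr(receiver, selector)(*args))
    except Exception as exc:
        future.set_exception(exc)


class Vat:
    """One native thread plus a mailbox; owns the objects it adopts (hypothetical)."""

    def __init__(self):
        self.mailbox = queue.Queue()
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def adopt(self, obj):
        """Assign obj to this vat by tagging it with its owner."""
        obj._owner = self
        return obj

    def _run(self):
        # Drain the mailbox until the None sentinel arrives.
        while True:
            item = self.mailbox.get()
            if item is None:
                break
            _perform(*item)

    def stop(self):
        self.mailbox.put(None)
        self.thread.join()


def send(receiver, selector, *args):
    """Intercepted send: run locally if possible, otherwise queue it on the owner."""
    owner = getattr(receiver, "_owner", None)
    future = Future()
    if owner is None or owner.thread is threading.current_thread():
        # Fast path: receiver is unowned or owned by the running thread.
        _perform(receiver, selector, args, future)
    else:
        # Non-local send: hand the message to the owning thread.
        owner.mailbox.put((receiver, selector, args, future))
    return future
```

Answering a future even for local sends keeps the caller's code the
same in both cases, at the cost of extra allocation on the fast path;
a real implementation would probably want the local case to be a plain
send.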

I do appreciate your comments and know that I may be wasting my  
time.  :)

>
> The simple yet efficient ObjectMemory of current Squeak cannot be used
> with parallel threads (at least not without significant
> synchronization overhead). AFAIK, efficient algorithms require every
> thread to have its own object allocation area to avoid contention on
> object allocations. Tenuring (making young objects old) and storing
> new objects into old objects (remembered table) require
> synchronization. In other words, grafting a threadsafe object memory
> onto Squeak would be a major project.
>
> In contrast, for a significant subset of applications (servers) it is
> orders of magnitude simpler to run several images in parallel. Those
> images don't stomp on each other's object memory, so there is
> absolutely no synchronization overhead. For stateful sessions, a front
> end can handle routing requests to the image which currently holds a
> session's state; stateless requests can be handled by any image.
>
> Cheers,
> Hans-Martin
>



