Info on Smalltalk DSLs or Metaprogramming...

Michael Latta lattam at
Thu Aug 17 17:37:40 UTC 2006

Here is some recent experience in Java that has relevance to the multi-core
discussion:

1) My system has 8 logical processors (4 cores with HyperThreading).
2) My first attempt at making a set of tasks concurrent got about a -10%
improvement (that is, it ran slower).  With good profilers and several days
of work I got that up to a 2X speedup (using all 8 logical processors).
Individual parts of the process are getting 100% scalability, and others are
still single threaded (which is very common).
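
To put rough numbers on that, here is a small sketch using Amdahl's law.
The law is not part of my measurements; it is just a standard model I am
applying after the fact, and it assumes all 8 logical processors are
equally capable (HyperThreading siblings are not, so treat the result as
approximate). The function names are made up for illustration.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Predicted speedup when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def parallel_fraction_for(speedup, workers):
    """Invert Amdahl's law: what fraction must be parallel to see this speedup?"""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / workers)


# A 2X speedup on 8 workers implies only about 4/7 of the run time
# was actually spread across the processors.
p = parallel_fraction_for(2.0, 8)
print(f"parallel fraction: {p:.3f}")
print(f"speedup on 8 workers: {amdahl_speedup(p, 8):.2f}")
print(f"upper bound with unlimited workers: {1.0 / (1.0 - p):.2f}")
```

Under this model the single-threaded parts dominate: with that same
parallel fraction, adding more processors could never get much past 2.3X.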

While it is tempting to think that multiple cores mean one application will
run faster, in practice that is much harder to achieve, and FAR FAR harder
to debug because of the unpredictability.

You mention that you will be running multiple simulations.  Running each
simulation in a separate image is far more likely to produce good
scalability.  You can easily use various methods to move control objects
from one image to another to coordinate the simulations and migrate data
from one image to another at the beginning of a simulation.  I would pursue
that approach much more than trying to get one application to run faster on
2 cores.  2 cores are useful for multi-processing, but almost useless for
multi-tasking, as you generally want a thread to supervise the other threads,
which drops you back down to 1 task-oriented core.
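
As a rough illustration of the one-process-per-simulation idea, here is a
minimal sketch in Python rather than Squeak. Each simulation runs in its
own interpreter process, which stands in for a separate image; the toy
"simulation" (summing seeded random numbers) and the names SIMULATION and
run_simulations are placeholders I made up, not anything from OSProcess or
#forkSqueak.

```python
import subprocess
import sys

# Source of a stand-in simulation. It runs in its own interpreter, so it
# shares no memory with the coordinator or with the other simulations.
SIMULATION = """
import random, sys
seed = int(sys.argv[1])
rng = random.Random(seed)
total = sum(rng.random() for _ in range(10000))
print(seed, total)
"""


def run_simulations(seeds):
    """Start one OS process per seed, then collect each printed result."""
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", SIMULATION, str(seed)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for seed in seeds
    ]
    results = {}
    for proc in procs:
        out, _ = proc.communicate()  # wait for this simulation to finish
        if proc.returncode != 0:
            raise RuntimeError("simulation process failed")
        seed_text, total_text = out.split()
        results[int(seed_text)] = float(total_text)
    return results


if __name__ == "__main__":
    for seed, total in sorted(run_simulations([1, 2, 3, 4]).items()):
        print(seed, total)
```

The coordinator only hands over a seed at the start and reads one line back
at the end, so there is no shared state to lock or debug. The operating
system is free to schedule the processes on whatever cores are available.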


> -----Original Message-----
> From: squeak-dev-bounces at [mailto:squeak-dev-
> bounces at] On Behalf Of Rich Warren
> Sent: Thursday, August 17, 2006 4:52 AM
> To: The general-purpose Squeak developers list
> Subject: Re: Info on Smalltalk DSLs or Metaprogramming...
> On Aug 17, 2006, at 12:52 AM, David T. Lewis wrote:
> > On Wed, Aug 16, 2006 at 11:36:33PM -1000, Rich Warren wrote:
> >>
> >> Ruby can more-easily create actual child processes, which might let
> >> me plow through the data faster on a dual-core machine (though, I
> >> believe this might lock it into only running on Unix boxes).
> >
> > Given that you are expecting to run on a unix box anyway, you can
> > do this quite easily in Squeak, see OSProcess and CommandShell on
> > SqueakMap. If you want to run multiple Squeak images on multiple
> > CPUs, look at #forkSqueak.
> Aha. Again, it's the low-on-the-learning-curve curse. I guess I
> shouldn't be surprised. Both use the same basic threading model. I
> should have suspected similar OS process forking abilities.
> >
> > But I would be very surprised if you need this in practice. You can
> > manipulate really large amounts of data directly in Squeak, and
> > you'll be amazed how fast it can be.  Once you load your data into
> > Squeak, everything is in memory, and Squeak itself is quite fast.
> You're probably right, given that using both cores will give me a 2x
> boost, max (and that's being overly optimistic). So the speed boost
> you mention might largely compensate. Struggling to get it running on
> both cores feels like trying to eke out more speed by implementing it
> in highly profiled and optimized (and therefore non-portable) C code
> and assembly. I don't need to go down that road.
> I do have a few machines on my home network whose idle cycles I could
> borrow. That's where something like Rinda becomes more attractive.
> And, if I work out the bugs in my home network, then I could run it
> on the school's 32-processor cluster. Doing that should give me an
> order of magnitude boost in speed--which could make a big difference,
> depending on how many simulations we want to run (I'm guessing at
> least in the mid- to high- 100s), and how long each simulation will
> take.
> Regarding the DSL itself, I am currently planning on trying to
> implement it using just classes and methods. Using the config file
> approach, I'll read in a line at a time, and evaluate each line.  But
> I'll check out SmaCC, just in case I need the extra power.
> Thanks,
> -Rich-
