Vats, Islands, and Collections: Finding cheap parallelism
Jecel Assumpcao Jr
jecel at merlintec.com
Thu Jan 24 19:14:56 UTC 2008
Sam Adams wrote on Thu, 24 Jan 2008 10:00:15 -0500:
> It would be interesting though to exchange ideas and approaches as you
> proceed in your research.
I am interested in this as well. My current work only uses 12 cores but
is based on the stuff I did for 64 node machines (and larger) back in
the early 1990s. Too bad I have to write all my stuff in Portuguese...
About Matthew's question, my suggestion is to implement Concurrent
Aggregates. These would have a local representative in each Vat that
would know about all of its "brothers". It would also store a subset of
the collection locally. So if you have a 12,000-element
ConcurrentArray distributed among 128 Vats, then each part would hold
around 94 elements. A message sent to any of the local representatives
is repeated to all of the parts, with proper care taken to serialize
messages that arrive at different parts at the same time.
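
A minimal sketch of that idea, written in Python rather than Smalltalk
so it can be run on its own. The names Part, make_concurrent_array and
split_sizes are made up for illustration, and the single shared lock is
only a crude stand-in for whatever ordering protocol a real
implementation would use between Vats.

```python
import threading


def split_sizes(n_elements, n_parts):
    """Sizes of the parts when elements are spread as evenly as possible."""
    base, extra = divmod(n_elements, n_parts)
    return [base + 1 if i < extra else base for i in range(n_parts)]


class Part:
    """Local representative: holds one slice and knows all its siblings."""

    def __init__(self, elements, order_lock):
        self.elements = list(elements)
        self.siblings = []              # filled in once all parts exist
        self._order_lock = order_lock   # assumption: one lock per aggregate

    def send(self, fn):
        """Repeat a message (here: apply fn to every element) to all parts.

        Holding the shared lock for the whole broadcast means two messages
        entering through different parts are applied in the same order
        everywhere.
        """
        with self._order_lock:
            for part in self.siblings:
                part.elements = [fn(x) for x in part.elements]


def make_concurrent_array(values, n_parts):
    order_lock = threading.Lock()
    parts, start = [], 0
    for size in split_sizes(len(values), n_parts):
        parts.append(Part(values[start:start + size], order_lock))
        start += size
    for part in parts:
        part.siblings = parts           # every part includes itself
    return parts


# 12,000 elements over 128 parts: 96 parts hold 94, the other 32 hold 93.
parts = make_concurrent_array(list(range(12000)), 128)
parts[5].send(lambda x: x + 1)          # any representative will do
```

Sending through parts[5] or parts[0] has the same effect, which is the
point of having a representative in every Vat.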
On top of this you will want to build a programming model very similar
to APL (see FScript). The advantage of building the system in layers
like this is that neither hiding all the details nor exposing everything
works well for all applications. If you can select from different layers
as needed for your specific program, then you will be able to tune
things for the best performance with the least code.
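
To make the layering concrete, here is a rough sketch of what an
APL-like layer on top of a partitioned collection might look like, again
in Python. ChunkedArray, each and fold are hypothetical names, not
FScript or Squeak API, and a thread pool stands in for the Vats. The
fold assumes an associative operation whose initial value is its
identity.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


class ChunkedArray:
    """Whole-collection operations over a list of independent chunks."""

    def __init__(self, chunks):
        self.chunks = [list(c) for c in chunks]

    def each(self, fn):
        """Apply fn to every element; each chunk is processed separately."""
        with ThreadPoolExecutor() as pool:
            new_chunks = list(
                pool.map(lambda c: [fn(x) for x in c], self.chunks)
            )
        return ChunkedArray(new_chunks)

    def fold(self, op, initial):
        """Reduce each chunk on its own, then combine the partial results."""
        partials = [reduce(op, c, initial) for c in self.chunks]
        return reduce(op, partials, initial)


a = ChunkedArray([[1, 2, 3], [4, 5], [6]])
total = a.each(lambda x: x * x).fold(operator.add, 0)   # sum of squares
```

A program that needs more control can drop down and address the chunks
directly; one that does not can stay at the each/fold level.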
My rule of thumb is to roughly match the number of Vats and available
cores. A single Vat per core is not efficient because the whole core
will be idle whenever the software blocks, but too many Vats on a core
will cause a lot of switching overhead. My particular hardware design
supports eight user Vats and eight system ones on each core.
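
As a small illustration of that rule of thumb, the sketch below spreads
user Vats over cores round-robin and refuses a configuration that would
exceed a per-core limit. The function name assign_vats and the
round-robin policy are assumptions for the example; the default limit of
eight is the user-Vat figure from my hardware design, and system Vats
are ignored.

```python
def assign_vats(n_vats, n_cores, max_per_core=8):
    """Return a list of cores, each a list of the Vat numbers placed on it."""
    if n_vats > n_cores * max_per_core:
        raise ValueError("more Vats than the cores can hold")
    cores = [[] for _ in range(n_cores)]
    for vat in range(n_vats):
        cores[vat % n_cores].append(vat)    # round-robin placement
    return cores


# 40 Vats on 12 cores: four cores get 4 Vats, the other eight get 3.
placement = assign_vats(40, 12)
```

With these numbers 12 cores hold at most 96 user Vats, so the 128-Vat
example earlier would need at least 16 such cores.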