[squeak-dev] The "correct" approach to multi-core systems.
Joshua Gargus
schwa at fastmail.us
Sat Feb 23 09:03:31 UTC 2008
On Feb 22, 2008, at 11:51 PM, Michael van der Gulik wrote:
> On 2/23/08, Joshua Gargus <schwa at fastmail.us> wrote:
> On Feb 22, 2008, at 7:01 PM, Michael van der Gulik wrote:
>
>> this makes sharing objects and synchronising access while still
>> getting good performance more difficult. I can't back up my claims
>> yet; we'll see how Hydra VM works out.
>>
>> In the long term, the goal should be a VM that can run its green
>> threads (aka Process) on multiple OS threads (aka pthreads).
>
> This is debatable. Why are you convinced that fine-grained
> concurrency will not involve a large performance hit due to CPU
> cache invalidations? I haven't heard a compelling argument that
> this won't be a problem (and increasingly so, as the number of cores
> grows). We can't pretend that it takes zero time to make an object
> available for processing on a different core. As I've said before,
> I'm willing to be convinced otherwise.
>
> Equally so, why would any other concurrent implementation, such as
> HydraVM, not have exactly the same problem?
Because within HydraVM, each VM has its own ObjectMemory in a single,
contiguous chunk of memory.
Below, you mention processor-affinity. This is certainly necessary,
but is orthogonal to the issue. Let's simplify the discussion by
assuming that the number of VMs is <= the number of cores, and that
each VM is pinned to a different core.
CPU caches maintain coherence at the granularity of cache lines
(typically 64 bytes; the 4KB figure is the virtual-memory page size),
and quite a few small objects fit in a single line. The problem is
that if processor A and processor B are operating in the same
ObjectMemory, they don't even have to touch the same object to cause
cache contention... they merely have to touch objects that share a
cache line (so-called "false sharing"). Can you provide a formal
characterization of worst-case and average-case performance under a
variety of application profiles? I wouldn't know where to start.
Happily, HydraVM doesn't have to worry about this, because each thread
operates on a separate ObjectMemory.
> Or why would any other concurrent application not have this problem?
They can, depending on the memory access patterns of the application.
>
>
> Real operating systems implement some form of processor affinity[1]
> to keep cache on a single processor. The same could be done for the
> Squeak scheduler. I'm sure that the scheduling algorithm could be
> tuned to minimize cache invalidations.
As I described above, the problem is not simply ensuring that each
thread tends to run on the same processor. I believe that you're
overlooking a crucial aspect of real-world processor-affinity schemes:
when a Real Operating System pins a process to a particular
processor, the memory for that process is only touched by that
processor.
I haven't had a chance to take more than a glance at it, but Ulrich
Drepper of Red Hat has written a paper named "What Every Programmer
Should Know About Memory". It's dauntingly comprehensive.
It might help to think of a multi-core chip as a set of separate
computers connected by a network (I don't have the reference off-hand,
but I've seen an Intel whitepaper that explicitly takes this
viewpoint). It's expensive and slow to send messages over the network
to ensure that my cached version of an object isn't stale. In
general, it's better to structure our computation so that we know
exactly when memory needs to be touched by multiple processors.
Cheers,
Josh
>
>
> [1] http://en.wikipedia.org/wiki/Processor_affinity
>
> Gulik.
>
>
> --
> http://people.squeakfoundation.org/person/mikevdg
> http://gulik.pbwiki.com/