[Vm-dev] Ideas on cheap multi-threading for Squeak / Pharo ? (from Tim's article)

Ronie Salgado roniesalg at gmail.com
Tue Jan 31 18:08:51 UTC 2017


Hi All,

> *Threads are more useful when one needs high performance and low latency in
> an application that runs in a single computer. High performance video games
> and (soft) realtime graphics are usually in this domain.*
>
> I know you're working with high performance video games. If you would
> introduce multi-threading in Squeak/Pharo, how would you do it ?
> Especially, do you have a design in mind that does not require to rewrite
> all the core libraries ?
>
For Pharo I am going the GPGPU route, using either OpenCL or a low-level
graphics API (Vulkan, D3D 12 or Metal). This way, I do not have to change
the VM or Pharo to use the many threads available on the GPU. I am
modifying Pharo and the VM for other purposes, such as being able to
submit a lot of data to the GPU so that it can be kept busy.

For actual CPU-side multithreading, I am abandoning Pharo and the VM by
making an ahead-of-time (AoT) compiler (something similar to Bee Smalltalk),
where I am using the OpalCompiler as a frontend and an SSA-based
intermediate representation which is very similar to the one offered by
LLVM, but written in Pharo. I had to make this SSA IR to be able to generate
the shaders for Vulkan from Pharo, so for this AoT compiler I am just
reusing it by adding a machine code backend. With my framework I am able to
generate an ELF32 or ELF64 object file that can be linked with any C library
directly, such as a minimalistic runtime
( https://github.com/ronsaldo/slvm-native ) that provides Smalltalk
facilities such as message sends, object allocation, GC, segmented stacks,
etc.

I already have some things working, like message sends, the segmented
stack, and block closure creation and activation. For the object model, I am
using the Spur object model, but with some slight modifications. Object
interiors are aligned to 16 bytes so that SSE instructions can be used.
There is a small preheader for implementing the LISP2 GC algorithm (I
chose it for its simplicity), become, and heap management. The preheader is
not used by generated code, except for serializing objects in the object
file. I changed the CompiledMethod object type so that it supports generic
objects mixing oops and native data. For GC and multithreading, I will just
be stopping the whole world at safe points and doing GC in a single thread.
By disabling the GC, the user could schedule it to run at moments when a
pause goes unnoticed, such as just after sending a frame rendering command.
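
As a rough illustration of the LISP2 mark-compact algorithm mentioned above,
here is a minimal Python sketch over a toy heap. It is not the slvm-native
code. The names `Obj`, `align_up` and `lisp2_collect` are hypothetical, and
modelling the preheader as a mark bit plus a forwarding address is an
assumption, as is rounding every object start up to 16 bytes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

ALIGNMENT = 16  # bytes; assumption mirroring the 16-byte alignment in the text


def align_up(n: int, alignment: int = ALIGNMENT) -> int:
    """Round n up to the next multiple of alignment (a power of two)."""
    return (n + alignment - 1) & ~(alignment - 1)


@dataclass
class Obj:
    size: int                                      # object size in bytes
    refs: List[int] = field(default_factory=list)  # addresses of referenced objects
    marked: bool = False                           # hypothetical preheader: mark bit
    forward: int = -1                              # hypothetical preheader: new address


def lisp2_collect(heap: Dict[int, Obj],
                  roots: List[int]) -> Tuple[Dict[int, Obj], List[int]]:
    """Single-threaded mark-compact over a toy heap (address -> object)."""
    # Mark phase: flag everything reachable from the roots.
    stack = list(roots)
    while stack:
        obj = heap[stack.pop()]
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)

    # Pass 1: walk the heap in address order and assign forwarding addresses.
    free = 0
    for addr in sorted(heap):
        obj = heap[addr]
        if obj.marked:
            obj.forward = free
            free = align_up(free + obj.size)

    # Pass 2: rewrite every reference (and root) to the forwarding address.
    for obj in heap.values():
        if obj.marked:
            obj.refs = [heap[r].forward for r in obj.refs]
    new_roots = [heap[r].forward for r in roots]

    # Pass 3: slide live objects to their new addresses and reset the preheader.
    new_heap: Dict[int, Obj] = {}
    for addr in sorted(heap):
        obj = heap[addr]
        if obj.marked:
            new_heap[obj.forward] = obj
            obj.marked = False
            obj.forward = -1
    return new_heap, new_roots
```

Because the whole collection runs in one thread while the world is stopped,
none of the passes need any synchronization, which is presumably part of the
simplicity argument.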

AoT compilation of Smalltalk is going to make modifications to method
dictionaries a very rare operation: you cannot AoT compile methods at run
time, so you do not need the compiler in a shipping application. This puts
the burden of thread safety on a small number of places that can be
protected explicitly with mutexes.
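
To make the point about rare method dictionary writes concrete, here is a
small Python sketch of a dictionary whose installs are guarded by a mutex.
The class `MethodDictionary` and its methods are hypothetical, and the
copy-on-write update (so that lookups never take the lock) is a design
choice of this sketch, not something described above.

```python
import threading


class MethodDictionary:
    """Toy method dictionary: frequent lookups, rare mutex-guarded installs."""

    def __init__(self):
        self._methods = {}
        self._lock = threading.Lock()

    def lookup(self, selector):
        # Readers only ever see a complete dictionary, because install()
        # publishes a fresh copy instead of mutating the current one.
        return self._methods.get(selector)

    def install(self, selector, method):
        # Writers are serialized by the mutex so no update is lost.
        with self._lock:
            updated = dict(self._methods)
            updated[selector] = method
            self._methods = updated


if __name__ == "__main__":
    md = MethodDictionary()
    md.install("printOn:", lambda receiver, stream: stream.append(repr(receiver)))
    out = []
    md.lookup("printOn:")(42, out)
    print(out)
```

Copying the dictionary on every install would be wasteful if installs were
common, but it is a reasonable trade when they are as rare as they should be
in an AoT-compiled application.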

My plan with this infrastructure is to keep Pharo and the standard VM as a
game prototyping and development environment, but to do the actual
deployment with this very experimental ahead-of-time compiler and the
minimalistic Smalltalk runtime.

Best regards,
Ronie

2017-01-31 12:57 GMT-03:00 Levente Uzonyi <leves at caesar.elte.hu>:

>
> On Tue, 31 Jan 2017, Stefan Marr wrote:
>
>
>> Hi Levente:
>>
>> On 31 Jan 2017, at 15:22, Levente Uzonyi <leves at caesar.elte.hu> wrote:
>>>
>>>> Also the question is does it really need to be objects? Alternatives
>>>> include things like tuple spaces (think Linda), low-level shared memory
>>>> buffers (Python and others, and apparently ECMAScript 2017).
>>>>
>>>
>>> You'd actually share a segment with objects stored in it. Low-level
>>> buffers are very restricting. They force you to serialize objects if you
>>> want to keep using them. And that has some unwanted overhead.
>>>
>>
>> What’s a segment?
>>
>
> It's a read-only chunk of memory holding objects.
>
>> Who controls the lifetime of it?
>>
>
> It's permanent.
>
>> Are you doing local GC plus global reference counting?
>>
>
> GC never touches that memory, because it can't change.
>
>> Somehow you’d still manage those objects, no?
>>
>
> No.
>
>
>>
>>>> If you go with objects, the problem is that you need to support GC. And,
>>>> I suppose Eliot will agree that GC for multithreaded systems isn’t exactly
>>>> zero cost.
>>>>
>>>
>>> You don't need multi-threaded GC here, just many independent
>>> single-threaded GCs, which we have already.
>>> Btw, this is the same thing Erlang does.
>>>
>>
>> I am probably missing something, but I’d think you need some global GC
>> mechanism. If you got shared objects, you need to coordinate the local GCs.
>>
>
> All shared objects are permanent and read-only.
>
>> In Erlang, most messages are copied, only large data chunks are shared by
>> reference. So, that restricts the need for globally coordinated GC quite a
>> bit, but you still need it as far as I can tell.
>>
>
> Here objects shared by reference would be permanent, therefore no GC would
> be required.
>
> Levente
>
>
>> Best regards
>> Stefan
>>
>>
>> --
>> Stefan Marr
>> Johannes Kepler Universität Linz
>> http://stefan-marr.de/research/
>
>
>