2016-01-08 18:31 GMT+01:00 Eliot Miranda <eliot.miranda@gmail.com>:

Hi Ben,

On Thu, Jan 7, 2016 at 4:40 PM, Ben Coman <btc@openinworld.com> wrote:
On Fri, Jan 8, 2016 at 2:51 AM, Eliot Miranda <eliot.miranda@gmail.com> wrote:
> Hi Ben,
>
> On Thu, Jan 7, 2016 at 10:39 AM, Ben Coman <btc@openinworld.com> wrote:
>>
>>
>> On Fri, Jan 8, 2016 at 1:20 AM, Eliot Miranda <eliot.miranda@gmail.com>
>> wrote:
>> >
>> > and here's a version with a better class comment
>> >
>> > On Thu, Jan 7, 2016 at 9:12 AM, Eliot Miranda <eliot.miranda@gmail.com>
>> > wrote:
>> >>
>> >> Hi Denis, Hi Clément, Hi Frank,
>> >>
>> >> On Thu, Jan 7, 2016 at 5:34 AM, Clément Bera <bera.clement@gmail.com>
>> >> wrote:
>> >>>
>> >>> Hello,
>> >>>
>> >>> Eliot, please, you told me you had the code and Denis is interested.
>> >>>
>> >>> It uses 3 primitives for performance.
>> >>
>> >>
>> >> Forgive the delay. I thought it proper to ask permission since the
>> >> code was written while I was at Qwaq. I'm attaching the code in a fairly raw
>> >> state. The code is MIT, but copyright 3DICC.
>> >>
>> >> It is a plugin replacement for Squeak's Mutex, and with a little
>> >> ingenuity could be a replacement for Squeak's Monitor. It is quicker
>> >> because it uses three new primitives to manage entering a critical section
>> >> and setting the owner, exiting the critical section and releasing the owner,
>> >> and testing whether a critical section is owned, entering it if it is unowned. The
>> >> use of the primitives means fewer block activations and ensure: blocks in
>> >> entering and exiting the critical section, and that's the actual cause of
>> >> the speed-up.
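>> >>
>> >> To make the block and ensure: overhead concrete, here is a rough workspace
>> >> sketch of a re-entrant critical section done without primitives. It is only
>> >> an illustration, not the attached code; the temps semaphore, owner and
>> >> critical are names made up for this sketch.
>> >>
>> >> | semaphore owner critical n |
>> >> "Assumption: a Semaphore plus an owner temp stand in for the mutex state."
>> >> semaphore := Semaphore forMutualExclusion.
>> >> owner := nil.
>> >> critical := [:aBlock | | active |
>> >>     active := Processor activeProcess.
>> >>     "Re-entry by the owner just runs the block; otherwise take the
>> >>      semaphore, record the owner, and clear it again on the way out."
>> >>     active == owner
>> >>         ifTrue: [aBlock value]
>> >>         ifFalse: [semaphore critical: [
>> >>             owner := active.
>> >>             aBlock ensure: [owner := nil]]]].
>> >> n := 0.
>> >> "A nested entry from the same process must not deadlock."
>> >> critical value: [critical value: [n := n + 1]].
>> >> n
>> >>
>> >> Every uncontended entry in this sketch pays for the nested blocks and an
>> >> ensure: of its own, on top of whatever Semaphore>>critical: does internally.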
>> >>
>> >> You can benchmark the code as is. Here are some results on 32-bit
>> >> Spur, on my 2.2GHz Core i7
>> >>
>> >> {Mutex new. Monitor new. CriticalSection new} collect:
>> >>     [:cs | | n |
>> >>     n := 0.
>> >>     [cs critical: [n := n + 1]. cs critical: [n := n + 1].
>> >>      cs critical: [n := n + 1]. cs critical: [n := n + 1].
>> >>      cs critical: [n := n + 1].
>> >>      cs critical: [n := n - 1]. cs critical: [n := n - 1].
>> >>      cs critical: [n := n - 1]. cs critical: [n := n - 1].
>> >>      cs critical: [n := n - 1].
>> >>      n] bench]
>> >>
>> >> {Mutex new. Monitor new. CriticalSection new} collect:
>> >>     [:cs | | n |
>> >>     n := 0.
>> >>     cs class name, ' -> ',
>> >>     [cs critical: [n := n + 1]. cs critical: [n := n + 1].
>> >>      cs critical: [n := n + 1]. cs critical: [n := n + 1].
>> >>      cs critical: [n := n + 1].
>> >>      cs critical: [n := n - 1]. cs critical: [n := n - 1].
>> >>      cs critical: [n := n - 1]. cs critical: [n := n - 1].
>> >>      cs critical: [n := n - 1].
>> >>      n] bench]
>> >>
>> >> #( 'Mutex -> 440,000 per second. 2.27 microseconds per run.'
>> >> 'Monitor -> 688,000 per second. 1.45 microseconds per run.'
>> >> 'CriticalSection -> 1,110,000 per second. 900 nanoseconds per run.')
>> >>
>>
>> This is great Eliot. Thank you and 3DICC. After loading the changeset
>> into Pharo-50515 (32 bit Spur) I get the following results on my
>> laptop i5-2520M @ 2.50GHz
>>
>> #('Mutex -> 254,047 per second'
>> 'Monitor -> 450,442 per second'
>> 'CriticalSection -> 683,393 per second')
>>
>> In a fresh Image "Mutex allInstances basicInspect" lists just two
>> mutexes...
>> 1. NetNameResolver-->ResolverMutex
>> 2. ThreadSafeTranscript-->accessSemaphore
>
>
> I hate myself for getting distracted, but I'm finding this is fun. One can
> migrate to the new representation using normal Monticello loads, in two steps.
>
> In the first version redefine Mutex and Monitor to subclass LinkedList and
> have their owner/ownerProcess inst var first (actually third after firstLink
> & lastLink), and add the primitives.
>
> In the next version check that all Mutex and Monitor instances are unowned
> and then redefine to discard excess inst vars.
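>
> A quick workspace check for the "unowned" condition, as a sketch only: it
> assumes the inst vars are named owner (Mutex) and ownerProcess (Monitor),
> as mentioned above; adjust the names if your image differs.
>
> | owned |
> "Collect every instance whose owner slot is currently set."
> owned := (Mutex allInstances
>         select: [:m | (m instVarNamed: 'owner') notNil]),
>     (Monitor allInstances
>         select: [:m | (m instVarNamed: 'ownerProcess') notNil]).
> "true means no instance is owned, so the redefinition can go ahead."
> owned isEmpty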
>
> Let me test this before committing, and see that all tests are ok.

Should Mutex and Monitor both directly subclass LinkedList and
duplicate the primitives in each?

Or should they both subclass CriticalSection which subclasses
LinkedList so the primitives are only defined once?

That's a good idea. Feel free to change the code, but test that the Monticello load handles this case properly first :-). Actually, given that the default state of all the Mutex and Monitor instances in the image is unowned (owner process is nil), it'll just work anyway. If we do that, we must make sure to include the 3DICC copyright in CriticalSection's class comment, and can eliminate it from the primitives.

What effect would using the primitives from the superclass have on
performance? If any, I'd vote to optimise for duplication rather than
"nice" design, but our comments should document this.

Likely in the noise. The inline caching machinery in the VM is far cheaper than the real overheads here, which are in block creation, process switch, and interpreter primitive invocation.

cheers -ben

--
_,,,^..^,,,_
best, Eliot