Le mer. 28 oct. 2020 à 17:23, Eliot Miranda <eliot.miranda@gmail.com> a écrit :
 
Hi Tobi, Hi Levente,

> On Oct 28, 2020, at 7:06 AM, Tobias Pape <Das.Linux@gmx.de> wrote:
>
> 
> Hi
>
>>>>> On 27.10.2020, at 23:39, Nicolas Cellier <nicolas.cellier.aka.nice@gmail.com> wrote:
>> Hi Tim,
>> what is the result on that box with threaded heartbeat VM of:
>> [SUnitToolBuilderTests new testHandlingNotification] timeToRun.
>> If it is more than 5000 ms, then you would confirm the problem that I encounter.
>> This testHandlingNotification is repeating 10 loops with a 200ms delay wait.
>> The problem I've got with threaded heartbeat VM is that first 5 delays run normally, but next 5 will last for 1 second instead of 200ms...
>> This is probably what happens on the CI server too (the test times out and CI fails).
>
> Also note that the CI test builds both vms but uses the last one built (because it overwrites the first), and that happens to be the itimer one, not th ethreaded.
>
> The code above runs in around ~2000 ms on my machine (~2015, with ubuntu 18.04)
>
> Things that happened:
> - I just ran the test suite in the DEBUG itimer headful and headless variant and it passes.
> - I just ran the test suite in the DEBUG threaded headful and headless variant and it passes.
> - I ran the RELEASE itimer headful and headless variant and it passes
> - I ran the RELEASE threaded headless variant and it FAILED as on the CI
> - I ran the RELEASE threaded headful variant and it FAILED LESS
>     I mean: testHandlingNotification passed, and so did  testValueWithinTimingBasic and testValueWithinTimingNestedInner
>     but testValueWithinTimingNestedOuter testValueWithinTimingRepeat still fail!
>
> So there are discrepancies between
> debug and release    and
> headful and headless    (at least for threaded release)
>
> TL;DR: The linux x86_32 cog v3 threaded release vm has a timing problem ...
>
> Does that help anyone?

If you add code to extract the number of ioProcessEvents calls etc (see About Squeak VM parameters tab for the relevant info) does that tell us more?  IIRC one available vm parameter is the number of heartbeats.  So we should be able to see if it is the heartbeat itself that is failing or if it is further up stream.


Hi Eliot,
I instrumented the code like that:

testHandlingNotification
    | receivedSignal resumed |
    receivedSignal := resumed := false.
    [ | count |
    "client-code puts up progress, and signals some notications"
    Transcript cr; nextPutAll: 'test begins:'; print: Time utcMicrosecondClock \\ 3600000000.
    count := 0.
    'doing something'
        displayProgressFrom: 0
        to: 10
        during:
            [ : bar | 10 timesRepeat:
                [ bar value: (count := count + 1).
                Transcript cr; nextPutAll: 'count:'; print: count.
                Transcript cr; nextPutAll: 'before wait:'; print: Time utcMicrosecondClock \\ 3600000000.
                (Delay forMilliseconds: 200) wait.
                Transcript cr; nextPutAll: 'after wait:'; print: Time utcMicrosecondClock \\ 3600000000.
                Notification signal: 'message'.
                Transcript cr; nextPutAll: 'after notified:'; print: Time utcMicrosecondClock \\ 3600000000.
                Transcript cr; nextPutAll: '# ioProcessEvents: '; print: (Smalltalk vmParameterAt: 57).
                Transcript cr; nextPutAll: '# forceInterruptCheck: '; print: (Smalltalk vmParameterAt: 58).
                Transcript cr; nextPutAll: '# check event calls: '; print: (Smalltalk vmParameterAt: 59).
                resumed := true ] ] ]
        on: Notification
        do:
            [ : noti | receivedSignal := true.
            Transcript cr; nextPutAll: 'notified at:'; print: Time utcMicrosecondClock \\ 3600000000.
            noti resume ].
    Transcript endEntry.
    self
         assert: receivedSignal ;
         assert: resumed

The result I obtain with an updated trunk image and 64bits threaded VM:
build.linux64x64/squeak.cog.spur/build$ ./squeak ../../../image/trunk6-64.image

test begins:80631205
notified at:80631251
count:1
before wait:80633324
after wait:80834614
after notified:80834670
# ioProcessEvents: 94
# forceInterruptCheck: 3368
# check event calls: 815
count:2
before wait:80835008
after wait:81036398
after notified:81036454
# ioProcessEvents: 103
# forceInterruptCheck: 3459
# check event calls: 906
count:3
before wait:81036755
after wait:81238355
after notified:81238374
# ioProcessEvents: 113
# forceInterruptCheck: 3549
# check event calls: 996
count:4
before wait:81238633
after wait:81440889
after notified:81440909
# ioProcessEvents: 123
# forceInterruptCheck: 3639
# check event calls: 1086
count:5
before wait:81441151
after wait:82609021
after notified:82609053
# ioProcessEvents: 132
# forceInterruptCheck: 4119
# check event calls: 1161
count:6
before wait:82609520
after wait:83611465
after notified:83611497
# ioProcessEvents: 133
# forceInterruptCheck: 4536
# check event calls: 1162
count:7
before wait:83611974
after wait:84614429
after notified:84614462
# ioProcessEvents: 134
# forceInterruptCheck: 4961
# check event calls: 1165
count:8
before wait:84614926
after wait:85617377
after notified:85617404
# ioProcessEvents: 135
# forceInterruptCheck: 5368
# check event calls: 1166
count:9
before wait:85617840
after wait:86618396
after notified:86618428
# ioProcessEvents: 136
# forceInterruptCheck: 5776
# check event calls: 1167
count:10
before wait:86618896
after wait:87620232
after notified:87620291
# ioProcessEvents: 137
# forceInterruptCheck: 6196
# check event calls: 1168

Notice 10 ioProcessEvents and about 200ms wait on loop 1-5.
Then a single ioProcessEvent and about 1000ms wait on loops 6-10 (single check event call too).