Hi Tobi, Hi Levente,
On Oct 28, 2020, at 7:06 AM, Tobias Pape Das.Linux@gmx.de wrote:
Hi
On 27.10.2020, at 23:39, Nicolas Cellier nicolas.cellier.aka.nice@gmail.com wrote:
Hi Tim, what is the result on that box with the threaded heartbeat VM of: [SUnitToolBuilderTests new testHandlingNotification] timeToRun. If it is more than 5000 ms, then you would confirm the problem that I encounter. This testHandlingNotification repeats 10 loops with a 200ms delay wait. The problem I've got with the threaded heartbeat VM is that the first 5 delays run normally, but the next 5 last for 1 second instead of 200ms... This is probably what happens on the CI server too (the test times out and CI fails).
Also note that the CI test builds both vms but uses the last one built (because it overwrites the first), and that happens to be the itimer one, not the threaded one.
The code above runs in around 2000 ms on my machine (from ~2015, with Ubuntu 18.04)
Things that happened:
- I just ran the test suite in the DEBUG itimer headful and headless variant and it passes.
- I just ran the test suite in the DEBUG threaded headful and headless variant and it passes.
- I ran the RELEASE itimer headful and headless variant and it passes
- I ran the RELEASE threaded headless variant and it FAILED as on the CI
- I ran the RELEASE threaded headful variant and it FAILED LESS; I mean: testHandlingNotification passed, and so did testValueWithinTimingBasic and testValueWithinTimingNestedInner, but testValueWithinTimingNestedOuter and testValueWithinTimingRepeat still fail!
So there are discrepancies between debug and release and headful and headless (at least for threaded release)
TL;DR: The linux x86_32 cog v3 threaded release vm has a timing problem ...
Does that help anyone?
If you add code to extract the number of ioProcessEvents calls etc (see the About Squeak VM parameters tab for the relevant info) does that tell us more? IIRC one available vm parameter is the number of heartbeats. So we should be able to see if it is the heartbeat itself that is failing or if it is further upstream.
Best regards -Tobias
BTW: Eliot, the VM spits out a "aioDisable: epoll_ctl: Bad file descriptor". Is that expected?
It’s not expected, but may be harmless. Levente, is this noise? Or is it that the input file descriptor is being shut down? Presumably this is just noise to do with running headless.
On 28.10.2020, at 17:23, Eliot Miranda eliot.miranda@gmail.com wrote:
ping me tomorrow, my linux machine is already switched off… -t
On Wed, Oct 28, 2020 at 17:23, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Eliot, I instrumented the code like this:
testHandlingNotification
	| receivedSignal resumed |
	receivedSignal := resumed := false.
	[ | count |
	  "client code puts up progress, and signals some notifications"
	  Transcript cr; nextPutAll: 'test begins:'; print: Time utcMicrosecondClock \\ 3600000000.
	  count := 0.
	  'doing something'
		displayProgressFrom: 0 to: 10
		during: [ :bar |
			10 timesRepeat: [
				bar value: (count := count + 1).
				Transcript cr; nextPutAll: 'count:'; print: count.
				Transcript cr; nextPutAll: 'before wait:'; print: Time utcMicrosecondClock \\ 3600000000.
				(Delay forMilliseconds: 200) wait.
				Transcript cr; nextPutAll: 'after wait:'; print: Time utcMicrosecondClock \\ 3600000000.
				Notification signal: 'message'.
				Transcript cr; nextPutAll: 'after notified:'; print: Time utcMicrosecondClock \\ 3600000000.
				Transcript cr; nextPutAll: '# ioProcessEvents: '; print: (Smalltalk vmParameterAt: 57).
				Transcript cr; nextPutAll: '# forceInterruptCheck: '; print: (Smalltalk vmParameterAt: 58).
				Transcript cr; nextPutAll: '# check event calls: '; print: (Smalltalk vmParameterAt: 59).
				resumed := true ] ] ]
		on: Notification
		do: [ :noti |
			receivedSignal := true.
			Transcript cr; nextPutAll: 'notified at:'; print: Time utcMicrosecondClock \\ 3600000000.
			noti resume ].
	Transcript endEntry.
	self assert: receivedSignal; assert: resumed
The result I obtain with an updated trunk image and 64-bit threaded VM:
build.linux64x64/squeak.cog.spur/build$ ./squeak ../../../image/trunk6-64.image
test begins:80631205
notified at:80631251
count:1 before wait:80633324 after wait:80834614 after notified:80834670 # ioProcessEvents: 94 # forceInterruptCheck: 3368 # check event calls: 815
count:2 before wait:80835008 after wait:81036398 after notified:81036454 # ioProcessEvents: 103 # forceInterruptCheck: 3459 # check event calls: 906
count:3 before wait:81036755 after wait:81238355 after notified:81238374 # ioProcessEvents: 113 # forceInterruptCheck: 3549 # check event calls: 996
count:4 before wait:81238633 after wait:81440889 after notified:81440909 # ioProcessEvents: 123 # forceInterruptCheck: 3639 # check event calls: 1086
count:5 before wait:81441151 after wait:82609021 after notified:82609053 # ioProcessEvents: 132 # forceInterruptCheck: 4119 # check event calls: 1161
count:6 before wait:82609520 after wait:83611465 after notified:83611497 # ioProcessEvents: 133 # forceInterruptCheck: 4536 # check event calls: 1162
count:7 before wait:83611974 after wait:84614429 after notified:84614462 # ioProcessEvents: 134 # forceInterruptCheck: 4961 # check event calls: 1165
count:8 before wait:84614926 after wait:85617377 after notified:85617404 # ioProcessEvents: 135 # forceInterruptCheck: 5368 # check event calls: 1166
count:9 before wait:85617840 after wait:86618396 after notified:86618428 # ioProcessEvents: 136 # forceInterruptCheck: 5776 # check event calls: 1167
count:10 before wait:86618896 after wait:87620232 after notified:87620291 # ioProcessEvents: 137 # forceInterruptCheck: 6196 # check event calls: 1168
Notice about 10 ioProcessEvents and an approximately 200ms wait on loops 1-5, then a single ioProcessEvents and an approximately 1000ms wait on loops 6-10 (a single check event call per loop too).
On Wed, Oct 28, 2020 at 22:11, Nicolas Cellier <nicolas.cellier.aka.nice@gmail.com> wrote:
And on the itimer heartbeat: 200ms per loop, about 10 ioProcessEvents, ~100 forceInterruptCheck and ~100 check event calls per loop. This is consistent with the 2ms heartbeat (vmParameter #26).
linux64x64/squeak.cog.spur/build.itimerheartbeat$ ./squeak ../../../image/trunk6-64.image
test begins:1013906366
notified at:1013906407
count:1 before wait:1013910362 after wait:1014112028 after notified:1014112085 # ioProcessEvents: 208 # forceInterruptCheck: 2299 # check event calls: 2236
count:2 before wait:1014112443 after wait:1014314011 after notified:1014314062 # ioProcessEvents: 218 # forceInterruptCheck: 2400 # check event calls: 2337
count:3 before wait:1014314405 after wait:1014516017 after notified:1014516052 # ioProcessEvents: 228 # forceInterruptCheck: 2501 # check event calls: 2438
count:4 before wait:1014516326 after wait:1014718011 after notified:1014718039 # ioProcessEvents: 237 # forceInterruptCheck: 2602 # check event calls: 2539
count:5 before wait:1014718296 after wait:1014920490 after notified:1014920521 # ioProcessEvents: 247 # forceInterruptCheck: 2703 # check event calls: 2640
count:6 before wait:1014920967 after wait:1015122184 after notified:1015122215 # ioProcessEvents: 256 # forceInterruptCheck: 2804 # check event calls: 2741
count:7 before wait:1015122608 after wait:1015324166 after notified:1015324198 # ioProcessEvents: 266 # forceInterruptCheck: 2905 # check event calls: 2841
count:8 before wait:1015324595 after wait:1015526278 after notified:1015526299 # ioProcessEvents: 275 # forceInterruptCheck: 3006 # check event calls: 2942
count:9 before wait:1015526556 after wait:1015728488 after notified:1015728518 # ioProcessEvents: 285 # forceInterruptCheck: 3107 # check event calls: 3042
count:10 before wait:1015728947 after wait:1015930499 after notified:1015930530 # ioProcessEvents: 295 # forceInterruptCheck: 3208 # check event calls: 3143
Hi Nicolas,
On Wed, Oct 28, 2020 at 2:21 PM Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:
Thanks, that's great. So the problem is nothing to do with the heartbeat. It seems to be working perfectly OK, right? _,,,^..^,,,_ best, Eliot
Hi Eliot,
On Wed, Oct 28, 2020 at 23:56, Eliot Miranda eliot.miranda@gmail.com wrote:
Well, the itimer flavour works fine, but the threaded one not so much. Why is there a 1000ms wait per loop instead of 200ms, and a single check event call and ioProcessEvents per loop instead of 10, starting at loop #6? Or do you mean that the forceInterruptCheck count remains correct? (about 90 interrupts per 200ms, then 420 interrupts per 1000ms - compared to a more regular 100 interrupts per 200ms in the itimer flavour)
vm-dev@lists.squeakfoundation.org