Yes, bits get pinned via beDisplay. We just have to ensure strong references from within the image.

Best,
Marcel


From: Juan Vuletich <juan@cuis.st>
Sent: Sunday, December 10, 2023 2:45:47 PM
To: Open Smalltalk Virtual Machine Development Discussion <vm-dev@lists.squeakfoundation.org>
Cc: Taeumel, Marcel <Marcel.Taeumel@hpi.de>
Subject: Re: [Vm-dev] Re: Please test | Relase candidate for OSVM 2023
 
Hi Marcel,

Thank you very much for making us aware of this!

Indeed I was not aware of the "new" requirement to call #beDisplay whenever Display bits change, and to keep the old instance around until this is done.

I'll make the appropriate changes to Cuis soon, hopefully tomorrow.

BTW, I guess this means that the Display bits are also pinned by the VM upon calling #beDisplay, right? Otherwise the GC would move them, and the crash would happen all the time.

Thanks a lot!

Cheers,

On 12/9/2023 4:29 AM, Taeumel, Marcel wrote:
 


If I remember correctly, the change to the macOS Metal support in OSVM (2022-06) was like this:
- use event mechanism (i.e. paint events) instead of raw Metal pipeline to avoid locking image performance to 60 FPS max
- OSVM does a lot of "event pumping" at unexpected places, probably in support of an input-event semaphore and thus a reactive in-image GUI framework (instead of cyclic Morphic, which polls events on its own)

Now the issue is with resizing DisplayScreen, which happens on snapshotting. There, the bits get minimized so that the image file contains only a smaller representative instead of, e.g., 4K forms (8 meg?). This is where the GC might collect those 8 meg while the VM does not know about the new bits yet. If unexpected "event pumping" then triggers a repaint event via Metal, our Metal backend processes a dangling "bits" pointer, and the VM crashes.

See the wording "does not slow down VM interpreter loop" here:
https://github.com/OpenSmalltalk/opensmalltalk-vm/pull/620

Those bits in DisplayScreen must be protected whenever screen-extent is changed from within the image such as on snapshotting.

Best,
Marcel

On 07.12.2023 13:10:22, Marcel <marcel.taeumel@hpi.uni-potsdam.de> wrote:

Aha!

If this was in a Cuis image, that image probably did not get the fix we did in DisplayScreen >> restore

restore
    | priorBits |
    priorBits := bits. "Must avoid to be GC'ed!"
    self setExtent: self class actualScreenSize depth: self nativeDepth.
    self beDisplay.
    priorBits := nil. "Documentation only."
    Project current ifNotNil: [:p | p displaySizeChanged].

For Cuis 5.0 and Cuis 6.0, this fix should probably be in DisplayScreen >> setExtent:depth:, where bits is cleared and thus can be GC'ed.

No, this cannot be fixed only in the VM platform code. The image allocates memory for the bits and communicates a pointer to that to the platform code via #beDisplay. Whenever those bits change, new bits must be communicated via #beDisplay *before* the GC collects the old bits. Image code must ensure that with a strong reference.

Well, this is no new behavior. The issue can also occur with the 2022-06 OSVM. But it is (still) rare. Probably related to how much old space the image uses. Definitely GC-related.

Best,
Marcel

On 06.12.2023 21:12:16, Eliot Miranda <eliot.miranda@gmail.com> wrote:



-- 
Juan Vuletich
cuis.st