Hi all,
I think this is not a new issue, but I am experiencing it with increasing frequency on my Win 10 machine: When instantiating SqueakSSL, primitiveCreate fails. I don't see an error in the console so I cannot tell more details about the error. Sometimes restarting my image helps, but right now, it does not. Other images running with the same VM (202010232046) can still access the internet via SSL.
Maybe it plays a role that the affected image puts the plugin much more under stress by always holding at least one open connection to a server.
Is this a known problem? Are there any workarounds?
Looking forward to a fix, Christoph
Oh no, this time the issue even persists after rebooting my operating system. `SqueakSSL new` just always fails. Exchanging the VM did not help; exchanging the image helped. Also, the failure occurs regardless of whether my device is connected to a network. So I guess my image has some kind of corrupted state? Is there some network/SSL configuration on the image side that I can reset in any way?
Well, I just found out that neither `Smalltalk listBuiltinModules` nor `Smalltalk listLoadedModules` contains the `SqueakSSL` plugin that is used by `#primitiveSSLCreate`. But it is also not loaded in a fresh trunk image where SSL works without any problems. Is something wrong with this? How could I reload the plugin?
On 2021-03-13, at 6:35 AM, Christoph Thiede notifications@github.com wrote:
> Well, I just found out that neither `Smalltalk listBuiltinModules` nor `Smalltalk listLoadedModules` contains the `SqueakSSL` plugin that is used by `#primitiveSSLCreate`. But it is also not loaded in a fresh trunk image where SSL works without any problems. Is something wrong with this? How could I reload the plugin?
It's an external plugin on all my systems - Mac & linux - so you shouldn't see it in the listBuiltinModules list. If I start a fresh image, the SSL plugin is not loaded. If I run the SqueakSSLTest tests in the TestRunner, they all pass and the plugin is in the listLoadedModules list.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Strange OpCodes: WDS: Warp Drive, Scotty!
Christoph, where does your SqueakSSL.dll live? Is it the right architecture (32/64-bit)?
> It's an external plugin on all my systems - Mac & linux - so you shouldn't see it in the listBuiltinModules list. If I start a fresh image, the SSL plugin is not loaded. If I run the SqueakSSLTest tests in the TestRunner, they all pass and the plugin is in the listLoadedModules list.
Thanks, Tim, this description makes sense and it looks the same in my fresh image! So the problem is that my problematic image does not automatically load the SSL plugin when invoking the primitive?
> Christoph, where does your SqueakSSL.dll live? Is it the right architecture (32/64-bit)?
Both are 64-bit.
I have just made two more observations:
1. I patched `primitiveSSLCreate` and added an `error: ec` argument to the primitive pragma. When the primitive fails, `ec` is set to `#'not found'`.
2. I used another VM (202003021730 instead of 202010232046) to open my image, and it shows something interesting at two points: first, during the start-up phase while `ThisOSProcess` is started up, and second, when evaluating `SqueakSSL new`. In both cases, a number of access violation errors pop up in the debug console of the VM:

   ```
   LoadLibrary(Win32OSProcessPlugin) (998: Unzuläÿssiger Zugriff auf einen Speicherbereich.
   )
   LoadLibrary(Win32OSProcessPlugin.dll) (998: Unzuläÿssiger Zugriff auf einen Speicherbereich.
   )
   # about 12 repetitions of the above ...
   LoadLibrary(SqueakSSL) (998: Unzuläÿssiger Zugriff auf einen Speicherbereich.
   )
   LoadLibrary(SqueakSSL.dll) (998: Unzuläÿssiger Zugriff auf einen Speicherbereich.
   )
   ```

   "Unzuläÿssiger Zugriff auf einen Speicherbereich" is German for "Invalid access to memory location" (by the way, the encoding is wrong, `äÿ` should be only `ä`).
There has already been a similar issue reported on the mailing-list, but apparently, it has not yet been solved: http://lists.squeakfoundation.org/pipermail/vm-dev/2017-June/025516.html
Can you please tell where the SqueakSSL.dll is located? and maybe find out which version you have?
> Can you please tell where the SqueakSSL.dll is located?
In my default configuration, the locations are as follows:
```
C:\Program Files (x86)\Squeak\SqueakSSL.dll
C:\Program Files (x86)\Squeak\Squeak.exe
C:\Users\Christoph\OneDrive\Dokumente\Squeak\ChristophTrunk.image
```
However, the problem still exists when I change the paths as follows:
```
C:\sq\SqueakSSL.dll
C:\sq\Squeak.exe
C:\ChristophTrunk.image
```
> and maybe find out which version you have?
If I run a fresh trunk image with Squeak.exe, after evaluating `SqueakSSL new`, the SystemReporter says `SqueakSSL VMMaker.oscog-eem.2805 (e)` or `SqueakSSL VMMaker.oscog-eem.2673 (e)`, depending on which of the two VM versions I use.
In the affected image, the module does not appear in the list.
Here is a stack trace from Visual Studio (collected by attaching the debugger to the running process):
```
0xC0000005: Access violation when writing at position 0xFFFFFFFFA05CB32C.
ntdll.dll!LdrpAllocateTlsEntry()
ntdll.dll!LdrpHandleTlsData()
ntdll.dll!LdrpDoPostSnapWork()
ntdll.dll!LdrpSnapModule()
ntdll.dll!LdrpMapAndSnapDependency()
ntdll.dll!LdrpMapDllWithSectionHandle()
ntdll.dll!LdrpMapDllNtFileName()
ntdll.dll!LdrpMapDllRetry()
ntdll.dll!LdrpProcessWork()
ntdll.dll!LdrpDrainWorkQueue()
ntdll.dll!LdrpLoadDllInternal()
ntdll.dll!LdrpLoadDll()
ntdll.dll!LdrLoadDll()
KernelBase.dll!LoadLibraryExW()
KernelBase.dll!LoadLibraryExA()
KernelBase.dll!LoadLibraryA()
Squeak.exe!00000000004aa129()
Squeak.exe!00000000004b8cd7()
Squeak.exe!00000000004b9062()
Squeak.exe!000000000041c642()
Squeak.exe!0000000000407926()
Squeak.exe!0000000000411c9e()
Squeak.exe!0000000000401d3c()
Squeak.exe!00000000004aea6b()
Squeak.exe!00000000004aed8f()
Squeak.exe!00000000004013c7()
Squeak.exe!00000000004014cb()
kernel32.dll!BaseThreadInitThunk()
ntdll.dll!RtlUserThreadStart()
```
Is this helpful in any way? :-)
can you do `dumpbin /exports` on the DLL?
Also, do you get such a backtrace when you invoke the method of UUID that contains `primitiveMakeUUID` ?
> can you do `dumpbin /exports` on the DLL?
202010232046:
<details><pre><code>
        0.00 version
           1 ordinal base
          12 number of functions
          12 number of names
    ordinal hint RVA      name
         12    0 000013A0 getModuleName
          2    1 000013B0 primitiveAccept = printf
          3    2 000014F0 primitiveConnect
          4    3 00001630 primitiveCreate
          5    4 00001680 primitiveDecrypt
          6    5 000017C0 primitiveDestroy
          7    6 00001820 primitiveEncrypt
          8    7 00001960 primitiveGetIntProperty
          9    8 000019F0 primitiveGetStringProperty
         10    9 00001B40 primitiveSetIntProperty
         11    A 00001BE0 primitiveSetStringProperty
          1    B 00001CA0 setInterpreter
  Summary
        1000 .CRT
        1000 .bss
        1000 .data
        4000 .debug_abbrev
        1000 .debug_aranges
        2000 .debug_frame
       44000 .debug_info
        9000 .debug_line
       13000 .debug_loc
        1000 .debug_macinfo
        1000 .debug_ranges
        3000 .debug_str
        1000 .edata
        1000 .idata
        1000 .pdata
        2000 .rdata
        1000 .reloc
        1000 .rsrc
        9000 .text
        1000 .tls
        1000 .xdata
</code></pre></details>
202003021730:
<details><pre><code>
File Type: DLL
  Section contains the following exports for SqueakSSL.dll
    00000000 characteristics
    5E5E1711 time date stamp Tue Mar 3 09:36:33 2020
        0.00 version
           1 ordinal base
          12 number of functions
          12 number of names
    ordinal hint RVA      name
         12    0 000013B0 getModuleName
          2    1 000013C0 primitiveAccept = sqSetupCert
          3    2 00001500 primitiveConnect
          4    3 00001640 primitiveCreate
          5    4 00001690 primitiveDecrypt
          6    5 000017D0 primitiveDestroy
          7    6 00001830 primitiveEncrypt
          8    7 00001970 primitiveGetIntProperty
          9    8 00001A00 primitiveGetStringProperty
         10    9 00001B50 primitiveSetIntProperty
         11    A 00001BF0 primitiveSetStringProperty
          1    B 00001CB0 setInterpreter
  Summary
        1000 .CRT
        1000 .bss
        1000 .data
        3000 .debug_abbrev
        1000 .debug_aranges
        1000 .debug_frame
       39000 .debug_info
        4000 .debug_line
        6000 .debug_loc
        1000 .debug_macinfo
        1000 .debug_ranges
        3000 .debug_str
        1000 .edata
        1000 .idata
        1000 .pdata
        1000 .rdata
        1000 .reloc
        1000 .rsrc
        4000 .text
        1000 .tls
        1000 .xdata
</code></pre></details>
> Also, do you get such a backtrace when you invoke the method of UUID that contains `primitiveMakeUUID`?
In the 202010232046 VM, `UUID new` does not raise an error but returns a plausible result. In the 202003021730 VM, another `LoadLibrary` warning appears in the console, but the expression answers a plausible result, too. The trace is:
```
0xC0000005: Access violation when writing at position 0xFFFFFFFF9DAC75EC.
ntdll.dll!LdrpAllocateTlsEntry()
ntdll.dll!LdrpHandleTlsData()
ntdll.dll!LdrpDoPostSnapWork()
ntdll.dll!LdrpSnapModule()
ntdll.dll!LdrpMapAndSnapDependency()
ntdll.dll!LdrpMapDllWithSectionHandle()
ntdll.dll!LdrpMapDllNtFileName()
ntdll.dll!LdrpMapDllRetry()
ntdll.dll!LdrpProcessWork()
ntdll.dll!LdrpDrainWorkQueue()
ntdll.dll!LdrpLoadDllInternal()
ntdll.dll!LdrpLoadDll()
ntdll.dll!LdrLoadDll()
KernelBase.dll!LoadLibraryExW()
KernelBase.dll!LoadLibraryExA()
KernelBase.dll!LoadLibraryA()
Squeak.exe!00000000004aa129()
Squeak.exe!00000000004b8cd7()
Squeak.exe!00000000004b9062()
Squeak.exe!000000000041c642()
Squeak.exe!0000000000407926()
Squeak.exe!0000000000411c9e()
Squeak.exe!0000000000401d3c()
Squeak.exe!00000000004aea6b()
Squeak.exe!00000000004aed8f()
Squeak.exe!00000000004013c7()
Squeak.exe!00000000004014cb()
kernel32.dll!BaseThreadInitThunk()
ntdll.dll!RtlUserThreadStart()
```
So yes, the stack appears to be identical.
So, this hints towards module loading in general. Can you build a debug VM and stop in https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/platforms/win32/v... It would be interesting to see what `libName` is.
Sorry for the late reply. I'm not yet very familiar with building the VM and did not debug it before at all 😅
As far as I can tell, everything looks fine when `LoadLibrary()` is called (I debugged `build.win64x64/squeak.cog.spur/builddbg`):


Still, these calls raise an access violation. The other attempts with `32`/`.dll` postfixes and image path prefix fail, too. As far as I understand it, the screenshotted call should succeed because the `Win32OSProcessPlugin.dll` is in the same directory as `Squeak.exe`.
Does it matter that according to the Process Explorer, the current working directory of Squeak.exe points to the location of the image file rather than to the location of the VM? According to the [docs](https://docs.microsoft.com/en-us/windows/win32/dlls/dynamic-link-library-sea...), this should not matter?
By the way, the SSL plugin can be successfully loaded when running the same image in WSL (Ubuntu). So it's indeed a VM problem.
---
BTW: When trying to run a 64-bit image in a 32-bit VM (which does not work, of course), an exception is generated from the following stack:
```
Squeak.exe!abortMessage(char * fmt) Line 63
Squeak.exe!printUsage(int level) Line 3116
Squeak.exe!sqMain(int argc, char * * argv) Line 1636
Squeak.exe!WinMain(HINSTANCE__ * hInst, HINSTANCE__ * hPrevInstance, char * lpCmdLine, int nCmdShow) Line 1779
Squeak.exe!main(int flags, char * * cmdline, char * * inst) Line 20
Squeak.exe!__tmainCRTStartup() Line 341
```
Shouldn't a meaningful error window be shown here instead? Do you have any idea why this is broken? I also noticed that the `args` variable seems to be corrupted. Or is it just pending to be initialized properly?

I tried to work with the Windows stuff today. Since I cannot get anything to work, I can't help. Sorry.
Hm ... for me, the cygwin approach has worked in the end ... anyone else?
Since I also observed this issue with the FFI-Plugin and OSProcess-Plugin under Windows 10, I suspect an issue with module loading in general. On my machine, this sometimes happens after, for example, performing a Windows update. I suspect some DLL caching issue. If I had the time, I would take a closer look at how the VM loads modules on Windows and then consult the MSDN documentation to reason about the circumstances and edge cases.
Update:
Because it seemed very strange to me that such a problem would only occur with a specific image, I gradually cleaned up a copy of my affected image today. Despite multiple attempts, I can't really tell in detail what the problem was, but somewhere in my image I had a months-old object explorer on DependentsFields open, and when I closed it as well as a few other browsers/workspaces, the problem suddenly disappeared at the next VM restart and module loading worked again.
This solves my concrete problem, but not the bug in general, so I would like to continue to investigate. However, after a number of attempts, my debug build of the VM no longer fails to find the module (though the release VM still does), so, unfortunately, I cannot continue my experiments right now.
@marceltaeumel I already did some research on this error but did not really find anything interesting. Circular dependencies could be a reason, but from what I could find out, the ordered list of loaded modules does not differ between my defective and my cleaned-up image, so this would not make sense. Some other points were also mentioned, such as security limitations (memory protection/DEP/ESP, never heard of that before), but I would be surprised if such a mechanism only took effect sporadically.
Forget most of the above, I just made a very interesting observation! :-)
Currently, the RAM size/disk size of my repaired image is only a couple of megabytes below 2^27 bytes. The defect image is 10 MB larger than this threshold. After loading a 100 MB file into the repaired image and restarting it, the module loading fails again. After freeing the file in the image and restarting it again, the module loading works again.
So the next question is: How could this happen? Are there any RAM size limits that I should be aware of? Or could this even be a bug in the `LoadLibrary()` function, because its error (998, `ERROR_NOACCESS`) does not mention exceeded size limits in any way? I have never dealt with such issues before; maybe you have some tips for me.
Got it! I managed to reproduce the issue!
Steps to reproduce:
1. Open a fresh Trunk image
2. Evaluate in a workspace:
   ```smalltalk
   large := Array new: 200000000.
   ```
3. Save and quit the image (don't close the workspace)
4. Reopen the image
5. Evaluate:
   ```smalltalk
   SqueakSSL new.
   ```
Et voilà, `primitiveSSLCreate` fails.
Can anyone else reproduce this? My config is: Windows 10 version 2004, 64-bit VM, 16 GB RAM. Could there be more relevant parameters?
Well, I wouldn't immediately celebrate "Got it!" 😄 but yes, I can reproduce this in Squeak 5.3 (64-bit) with the VM 202003021730. So, what does this behavior tell us? At least, it is related to some kind of memory management. Renaming "Squeak.exe" to something else or using "SqueakConsole.exe" or changing the VM to 202011120327 ... all options still show that bug in that "broken" image.
There is something wrong with module loading. Is there some interference between Squeak's object memory and how modules are loaded? Maybe take a look at how primitive 571 is implemented. (`SmalltalkImage >> #unloadModule:`)
Hi win32 folks,
coincidentally I found a huge bug with external primitives on win32 Spur VMs. The bug was the VM code generator not using the EXPORT macro to export primitive accessor depths. This macro does nothing on unix & macos (hence I didn't notice the bug), but is required on win32. The transparent forwarding mechanism for primitives is broken by this bug. Hence any external primitive that needs to retry to follow a forwarder (e.g. after a pin operation or a become to grow something, etc) won't retry on win32. I don't know if this will fix the SSL issue on win32. But it is certainly a smoking gun.
The bug is fixed in my most recent commit.
On Fri, Apr 9, 2021 at 12:45 AM Marcel Taeumel ***@***.***> wrote:
> Well, I wouldn't immediately celebrate "Got it!" 😄 but yes, I can reproduce this in Squeak 5.3 (64-bit) with the VM 202003021730. So, what does this behavior tell us? At least, it is related to some kind of memory management. Renaming "Squeak.exe" to something else or using "SqueakConsole.exe" or changing the VM to 202011120327 ... all options still show that bug in that "broken" image.
>
> There is something wrong with module loading. Is there some interference between Squeak's object memory and how modules are loaded? Maybe take a look at how primitive 571 is implemented. (`SmalltalkImage >> #unloadModule:`)
Sounds reasonable!
Well ... if it helps ... try to avoid the file chooser when starting the OSVM under Windows. Either specify the image path on the command line, use drag-and-drop, or just keep a single image in your VM path. There is something fishy going on there. Last time, I had repeated issues with FFI that I could resolve each time by opening the image file directly. This is very strange! Maybe it has something to do with those dialogs being obsolete in Windows 10? See:

- https://docs.microsoft.com/en-us/windows/win32/dlgbox/using-common-dialog-bo...
- https://docs.microsoft.com/en-us/windows/win32/shell/common-file-dialog
Note that I cannot reproduce the SSL bug as described by Christoph above in recent VMs. It seems that Eliot's fix worked. :-)
Hi all!
I just pulled the latest changes from OSVM and was still able to reproduce the steps from above. I'm sorry to say that the issue is not yet resolved for me.
Just retried it again multiple times with both VM versions and it only failed in the old VM. Seems to be fixed, thank you!
Closed #554.