[Vm-dev] OSVM on WebAssembly

Manuel Leuenberger maenuleu at gmail.com
Wed Aug 31 15:32:48 UTC 2022


Hi Eliot,

I can run Squeak with UI on WebAssembly now ;).

> On 20 Aug 2022, at 00:06, Eliot Miranda <eliot.miranda at gmail.com> wrote:
> 
> Hi Manuel,
> 
>> On Aug 17, 2022, at 5:12 PM, Manuel Leuenberger <maenuleu at gmail.com> wrote:
>> 
>> Hi Eliot,
>> 
>> It's alive!
> 
> Great progress! Good to hear.
> 
>> It is also refusing to die, but I'll just ignore that for now.
>> 
>> Currently, I get the image through startup and run the eval handler, no UI yet.
>> Code lives at https://github.com/maenu/opensmalltalk-vm/tree/wasm-1.
>> It lacks documentation and includes paths specific to my machine.
>> 
>> 
>> 
>> That was one tough cookie, but I finally found the reason for the failing #new message. I noticed that #new was the first non-quick primitive, so I looked at the primitiveFunctionPointer. That one seemed to be fine: I could call the #new primitive as a function pointer. It then took me a small eternity to figure out that it is not the machinery around the function pointer that is broken; the value of the pointer itself is the problem, because it leads to an ambiguity. WASM addresses start at 0, and #primitiveNew got an int value of 220. This confused the interpreter, which treated it as a quick primitive (value < 500 something) and was then confused even further because the value was below 256. That was the assertion that triggered. As I could not see an easier way to make function pointers and quick primitives distinguishable, I inserted a padding of 700 useless methods before the primitives. It sure ain't pretty, but to my surprise it does the job. From then on it was just a good ol' whac'a'flag.
> 
> Interesting. Assumptions about the implementation substrate and/or underlying platform are always present in any high-performance VM. I hope this assumption was sufficiently documented, but it looks like not. I like your solution. However, if primitiveFunctionPointer is signed on the WASM platform I would consider introducing an alternative, namely using negative numbers for the quick primitives.

I could not find this documented anywhere. I did expect the new platform to falsify some axioms; it was just hard to figure out which axiom it was in this case. Using negative numbers for quick primitives seems very elegant to me, though I have not tried it.
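
To make the ambiguity and the negative-number idea concrete, here is a minimal sketch in C. It is not the VM's code: prim_slot, is_quick_by_range, encode_quick and the other names are hypothetical, and the bound 519 is only inferred from the assertion in the log quoted further down. It also assumes that real function pointers never look negative when viewed as a signed integer, which seems plausible for WASM table indices but does not hold on every native platform.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound for quick primitive indices, inferred from the assertion in
   the quoted log: (localPrimIndex > 0xFF) && (localPrimIndex < 520). */
#define MAX_QUICK_PRIM_INDEX 519

/* Hypothetical stand-in for the slot that holds primitiveFunctionPointer. */
typedef intptr_t prim_slot;

/* Range-based scheme: any small value is taken to be a quick primitive.
   On WASM a function pointer is a small table index, so it can collide. */
int is_quick_by_range(prim_slot slot) {
    return slot >= 0 && slot <= MAX_QUICK_PRIM_INDEX;
}

/* Sign-based alternative: store quick primitive indices negated. */
prim_slot encode_quick(int index) {
    assert(index > 0 && index <= MAX_QUICK_PRIM_INDEX);
    return -(prim_slot)index;
}

/* Assumption: real function pointers are never negative as intptr_t. */
int is_quick_by_sign(prim_slot slot) {
    return slot < 0;
}

/* Recover the quick primitive index from a sign-encoded slot. */
int quick_index(prim_slot slot) {
    assert(is_quick_by_sign(slot));
    return (int)(-slot);
}
```

With the range-based check, the value 220 that #primitiveNew received is classified as a quick primitive; with the sign-based check it is not, while encode_quick(256) still round-trips through quick_index.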

> 
> As your work progresses you’re likely going to face more platform-specific issues, many of which will be within cCode:[inSmalltalk:] sends.  You may need to add cCode:wasmCode:[inSmalltalk:] or some such.  Please don’t be afraid to add such a thing.
> 
>> 
>> There is still a lot of stuff to do. E.g., external plugins break due to WASM being really picky about casting function pointers; Emscripten supports SDL/GL, so a headful WASM-VM seems also possible.
> 
> Super cool.  Keep us posted.  And I’m on Discord and/or Virtend if you need to talk/show some live state, etc.
> 

I put the function pointer issues aside for now (which means no external plugins/dlopen) and gave SDL a shot. And it does work!

I can run tests


and write and run some code


It is pretty clunky (I eyeball the input delay at about 0.1s) and needs some profiling to see where the bottleneck is. The main change here was replacing nanosleep with emscripten_sleep in aioSleepForUsecs, which yields to the browser instead of doing a synchronous wait when idle.
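
A rough sketch of that kind of change follows. It is not the actual aioSleepForUsecs from the VM sources; sleep_for_usecs is a hypothetical stand-in. It relies on emscripten_sleep taking milliseconds and needing an Asyncify-enabled build, and it rounds the microsecond value up to whole milliseconds, which is my own choice here.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* for nanosleep on strict C compilers */
#endif
#include <assert.h>
#include <time.h>
#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#endif

/* Hypothetical helper: sleep for the given number of microseconds.
   Returns 0 on success. */
int sleep_for_usecs(long microseconds) {
    assert(microseconds >= 0);
#ifdef __EMSCRIPTEN__
    /* Yields to the browser event loop instead of blocking the page.
       emscripten_sleep takes milliseconds, so round up. */
    emscripten_sleep((unsigned int)((microseconds + 999) / 1000));
    return 0;
#else
    /* Native build: synchronous wait, as before. */
    struct timespec request;
    request.tv_sec = microseconds / 1000000L;
    request.tv_nsec = (microseconds % 1000000L) * 1000L;
    return nanosleep(&request, NULL);
#endif
}
```

On a native build this behaves like a plain nanosleep wrapper; only the Emscripten branch hands control back to the browser.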

I guess the next step is a web-hosted demo, which could even include the VM sources for debugging in the browser (pretty much my local setup).
After that comes getting rid of the EMULATE_FUNCTION_POINTER_CASTS flag, which makes primitiveFunctionPointer work but rules out dlopen. I guess I have to generate or write an adapter function per plugin (https://emscripten.org/docs/porting/guidelines/function_pointer_issues.html#working-around-function-pointer-issues). I would appreciate a hint if somebody reading this can make sense of it.
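
For anyone trying to picture the adapter idea, here is a minimal sketch under heavy assumptions. All names (primitive_fn, example_plugin_primitive, adapter_table, lookup_primitive) are hypothetical, and I do not claim the VM's plugins use these signatures. The point is only that the function stored in the table has exactly the signature the caller uses, so no function pointer cast is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The single signature through which a hypothetical caller invokes primitives. */
typedef void (*primitive_fn)(void);

/* Where the adapter stores the wrapped function's result (illustration only). */
int last_result = 0;

/* Hypothetical plugin function whose real signature differs from primitive_fn. */
int example_plugin_primitive(void) {
    return 42;
}

/* Adapter: has the caller's signature and forwards to the real function. */
void example_plugin_primitive_adapter(void) {
    last_result = example_plugin_primitive();
}

typedef struct {
    const char *name;
    primitive_fn function;
} primitive_entry;

/* One entry per adapted plugin function; could be generated per plugin. */
static const primitive_entry adapter_table[] = {
    { "examplePluginPrimitive", example_plugin_primitive_adapter },
};

/* Look up an adapter by name; returns NULL if there is none. */
primitive_fn lookup_primitive(const char *name) {
    assert(name != NULL);
    size_t count = sizeof adapter_table / sizeof adapter_table[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(adapter_table[i].name, name) == 0) {
            return adapter_table[i].function;
        }
    }
    return NULL;
}
```

Calling the pointer returned by lookup_primitive("examplePluginPrimitive") goes through the adapter, so the call site and the callee agree on the signature.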

>> 
>> Cheers,
>> Manuel
>> 
>>> On 9 Jul 2022, at 02:04, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>>> 
>>> Hi Manuel,
>>> 
>>> On Fri, Jul 8, 2022 at 9:46 AM Manuel Leuenberger <maenuleu at gmail.com> wrote:
>>>  
>>> Hi Eliot,
>>> 
>>> Thanks for the explanation. I started looking into the return bytecode 348, but I could not find anything suspicious. So I started logging more and then I saw a dim light:
>>>  
>>>  [snip]
>>>  
>>> Once it reaches #doesNotUnderstand:, we are in an infinite loop.
>>> 
>>>> currentBytecode = 501
>>>> currentBytecode = 339
>>>> currentBytecode = 320
>>>> currentBytecode = 401
>>>> currentBytecode = 272
>>>> currentBytecode = 380
>>>> currentBytecode = 332
>>>> currentBytecode = 384
>>>> (localPrimIndex > 0xFF) && (localPrimIndex < 520) 5814
>>>> localPrimIndex = 253
>>>> currentBytecode = 385
>>>> localPrimIndex = 256
>>>> currentBytecode = 348
>>>> Smalltalk stack dump:
>>>>   0xaa2f6c MessageNotUnderstood class(Behavior)>new 0x1728360: a(n) MessageNotUnderstood
>>>>   0xaa2f90 SmallInteger(Object)>doesNotUnderstand: message: 0xfffffff1=-8
>>>>   0xaa2fbc SmallInteger(Object)>doesNotUnderstand: message: 0xfffffff1=-8
>>>>   0xaa2fe8 SmallInteger(Object)>doesNotUnderstand: manager: 0xfffffff1=-8
>>>>   0xaa300c SessionManager>newSession 0x177d300: a(n) SessionManager
>>>>   0xaa302c SessionManager>installNewSession 0x177d300: a(n) SessionManager
>>>>   0xaa3050 SessionManager>launchSnapshot:andQuit: 0x177d300: a(n) SessionManager
>>>>  0x43276f8 s [] in SessionManager>snapshot:andQuit:
>>>>  0x43278a0 s [] in FullBlockClosure(BlockClosure)>newProcess
>>> 
>>> My suspicion is a 32bit/64bit type issue. I also saw a few segfaults in the past, but not recently, and they were not reproducible.
>>> Does anybody have a clue what I should look at next?
>>> 
>>> Construct a debugger environment where you can breakpoint and step through code. Unless you have a WASM simulator this means working within whatever development environment is provided for WASM. You want to answer the question: why is the message not understood? To do that you need to answer the questions: what is the selector? What is the argument count? What is the receiver? etc...
>>> 
>>> 
>>> 
>>> Cheers,
>>> Manuel
>>> 
>>>> On 27 Jun 2022, at 23:31, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>>>> 
>>>> Hi Manuel,
>>>> 
>>>>    cool beans!
>>>> 
>>>> On Mon, Jun 27, 2022 at 3:54 AM Manuel Leuenberger <maenuleu at gmail.com> wrote:
>>>>  
>>>> Hi,
>>>> 
>>>> Ever since WebAssembly became a thing, I was wondering if this could become a target for VMs. People are already compiling FFMPEG and other complex tools. So I thought I would try as well.
>>>> 
>>>> So here I am to report to whom it may concern: OSVM compiles to WebAssembly, starts up (nearly), then loops infinitely.
>>>> Meaning: The VM mmaps the image file, loads plugins (SecurityPlugin made EXTERNAL), starts the interpreter loop, but then loops over the same bytecode sequence forever.
>>>> 
>>>> Code lives at https://github.com/maenu/opensmalltalk-vm/tree/Cog/building/minheadless.cmake/x86/pharo.stack.spur.wasm if you want to try it out.
>>>> 
>>>> Below is the current Readme, including a short list of issues. Maybe some of you could give me a hint?
>>>> 
>>>> Cheers,
>>>> Manuel
>>>> 
>>>> pharo.stack.spur.wasm
>>>> 
>>>> Compiles the OSVM Stack interpreter to WebAssembly using the Emscripten compiler. Emscripten can be used as a drop-in replacement for gcc/clang and cmake. Based on the MinHeadless Linux 32bit sources, as Emscripten provides a Linux-like environment (pthreads, nanosleep, dlopen, file system). Check the latest few commits of maenu to see the changed files.
>>>> 
>>>> Current issues
>>>> 
>>>> Most adjustments are just putting EMSCRIPTEN in a macro or script. Should be fine, but should be tested to not interfere with other builds.
>>>> 
>>>> Compiles and runs, but seems to be stuck in initial GC and Heartbeat. Those could be related to incorrect get/set64() implementation.
>>>> 
>>>> Removed mmap address hint, as it caused errors.
>>>> 
>>>> Using argv eval '1 + 3' to do a simple eval does not terminate.
>>>> 
>>>> Interpreter repeats these bytecodes forever (what is this?):
>>>> 
>>>> 
>>>> Taking these from e.g. src/spur64.stack/interp.c they are
>>>>  
>>>>        CASE(112)
>>>>         CASE(332) /*76*/
>>>>             /* pushReceiverBytecode */
>>>> 332
>>>>         CASE(208)
>>>>         CASE(209)
>>>>         CASE(210)
>>>>         CASE(211)
>>>>         CASE(212)
>>>>         CASE(213)
>>>>         CASE(214)
>>>>         CASE(215)
>>>>         CASE(216)
>>>>         CASE(217)
>>>>         CASE(218)
>>>>         CASE(219)
>>>>         CASE(220)
>>>>         CASE(221)
>>>>         CASE(222)
>>>>         CASE(223)
>>>>         CASE(384) /*128*/ i.e. send literal selector 0 with 0 args
>>>>         CASE(385) /*129*/
>>>>         CASE(386) /*130*/
>>>>         CASE(387) /*131*/
>>>>         CASE(388) /*132*/
>>>>         CASE(389) /*133*/
>>>>         CASE(390) /*134*/
>>>>         CASE(391) /*135*/
>>>>         CASE(392) /*136*/
>>>>         CASE(393) /*137*/
>>>>         CASE(394) /*138*/
>>>>         CASE(395) /*139*/
>>>>         CASE(396) /*140*/
>>>>         CASE(397) /*141*/
>>>>         CASE(398) /*142*/
>>>>         CASE(399) /*143*/
>>>>             /* sendLiteralSelector0ArgsBytecode */
>>>>  
>>>> 384
>>>>  
>>>> hence send literal selector 1 with 0 args
>>>> 
>>>> 385
>>>> 
>>>>         CASE(124)
>>>>         CASE(348) /*92*/
>>>>             /* returnTopFromMethod */
>>>> 348
>>>>  
>>>>         CASE(501) /*245*/
>>>>             /* longStoreTemporaryVariableBytecode */
>>>> 501
>>>>         CASE(136)
>>>>         CASE(339) /*83*/
>>>>             /* duplicateTopBytecode */
>>>> 339
>>>>         CASE(16)
>>>>         CASE(320) /*64*/
>>>>             /* pushTemporaryVariableBytecode */
>>>> 320
>>>>  
>>>>         CASE(224)
>>>>         CASE(225)
>>>>         CASE(226)
>>>>         CASE(227)
>>>>         CASE(228)
>>>>         CASE(229)
>>>>         CASE(230)
>>>>         CASE(231)
>>>>         CASE(232)
>>>>         CASE(233)
>>>>         CASE(234)
>>>>         CASE(235)
>>>>         CASE(236)
>>>>         CASE(237)
>>>>         CASE(238)
>>>>         CASE(239)
>>>>         CASE(400) /*144*/
>>>>         CASE(401) /*145*/ i.e. send literal selector 1 with 1 arg
>>>>         CASE(402) /*146*/
>>>>         CASE(403) /*147*/
>>>>         CASE(404) /*148*/
>>>>         CASE(405) /*149*/
>>>>         CASE(406) /*150*/
>>>>         CASE(407) /*151*/
>>>>         CASE(408) /*152*/
>>>>         CASE(409) /*153*/
>>>>         CASE(410) /*154*/
>>>>         CASE(411) /*155*/
>>>>         CASE(412) /*156*/
>>>>         CASE(413) /*157*/
>>>>         CASE(414) /*158*/
>>>>         CASE(415) /*159*/
>>>>             /* sendLiteralSelector1ArgBytecode */
>>>> 401
>>>>         CASE(64)
>>>>         CASE(272) /*16*/
>>>>             /* pushLiteralVariableBytecode */
>>>> 272
>>>>          CASE(204)
>>>>         CASE(380) /*124*/
>>>>             /* bytecodePrimNew */ i.e. a send of #new from the special selector bytecode
>>>> 380
>>>> 
>>>> If I had to guess what's going wrong I'd guess that the return bytecode 348 isn't correctly implemented.
>>>>  
>>>> 
>>>> Build & Run
>>>> 
>>>> 1. Install Emscripten
>>>> 
>>>> I installed Emscripten SDK <https://emscripten.org/docs/getting_started/downloads.html> to get an all-in-one package.
>>>> 
>>>> 2. Grab an image
>>>> 
>>>> Grab a 32bit Smalltalk image and put it in the image folder. I used Pharo 9.
>>>> 
>>>> cd building/minheadless.cmake/x86/pharo.stack.spur.wasm
>>>> mkdir image
>>>> cd image
>>>> curl https://get.pharo.org/32/90 | bash
>>>> 
>>>> 3. Build VM
>>>> 
>>>> ./mvm_configure_variant debug Debug && make -C debug install
>>>> 
>>>> 4. Run a web server
>>>> 
>>>> emrun --port 9090 --serve_root ../../../../ --no_browser .
>>>> 
>>>> 5. Launch VM
>>>> 
>>>> http://localhost:9090/building/minheadless.cmake/x86/pharo.stack.spur.wasm/debug/dist/squeak.html
>>>> 
>>>> 6. Inspect running VM
>>>> 
>>>> The VM is compiled with DWARF debug information, which is understood by the Chrome debugger. So we can step through the C sources of the WebAssembly, pretty nifty.
>>>> 
>>>> Resources
>>>> 
>>>> Inspect WebAssembly (at the bottom) <https://webassembly.org/getting-started/developers-guide/>
>>>> Emscripten doc (Porting) <https://emscripten.org/docs/porting/index.html>
>>>> Emscripten settings <https://emsettings.surma.technology/>
> Eliot
> _,,,^..^,,,_ (phone)


Cheers,
Manuel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screenshot 2022-08-17 at 19.17.19.png
Type: image/png
Size: 169299 bytes
Desc: not available
URL: <http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20220831/eaa678d9/attachment-0003.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screenshot 2022-08-31 at 16.14.01.png
Type: image/png
Size: 187547 bytes
Desc: not available
URL: <http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20220831/eaa678d9/attachment-0004.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screenshot 2022-08-31 at 16.46.06.png
Type: image/png
Size: 172911 bytes
Desc: not available
URL: <http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20220831/eaa678d9/attachment-0005.png>

