[Vm-dev] Slang for Cuis (WIP, experimental)

Tue May 26 21:30:29 UTC 2020

Paul, David, Eliot, everyone:

> And the latest versions (e.g. March 2013) of PetitParser & 
> PetitParserTests
> on Lukas Rengglis site work too

This is awesome news.

When I last tried this (some time last year), it was failing, throwing 
me into a very depressed mood.  Not that any of these issues are so bad 
or difficult in themselves, but when there is more work that needs to be 
done on the processor-modeling engine before it can really take off, 
than can fit in 24 hours/day, then looking at additional 100 small 
issues leads me to feel like, "ok forget it, there just isn't enough 
manpower".

I am happy to learn that I am wrong about that.

Ok.

Let me give you a quick overview.

I just committed a new VMMaker.  Compared to the Sunday revision, this 
one runs not only in gem5, but also on the native desktop CPU without 
any CPU-emulation involved.  So at least there is some point of 
integration that's working.  It has, of course, obvious defects; for 
example, it is very slow.  Part of it is because it's doing 
obviously-stupid things, e.g. only processing RSP S-packet when T-packet 
is available, indiscriminately asking for g- and G-transfers when 
neither is needed, doing unnecessary M-transfers, etc etc etc.  A much 
larger reason is the frequency of semihosting requests (i.e. the 
segfaults on access to fake addresses; or what Modtalk calls "surgery 
points").  On the out-of-the-box SimpleAtomic gem5 model of x86, getting 
SpurReader to the first prompt takes 8935941 guest instructions in 
15743835 clock cycles (7871917500ps simulated time) and 311777 
segfaults: about 1 segfault per every 28 instructions.  One interesting 
feature of "Target-Agnostic Modtalk" was the flexibility to configure, 
for each runtime subroutine, on which side of the segfault fence it will 
execute.  In other words, if I have, say, all the lookup and send 
machinery production-ready, but I still need to write primitiveNew in 
simulation in the debugger, I can configure send/lookup code to execute 
natively -- i.e. with no segfaults involved -- but a jump to 
primitiveNew will segfault and get me into the Smalltalk debugger.

But there is bigger fish to fry.

This "RSP client" is a small part of "the ULD debugger" which in turn is 
part of "the target-agnostic verified engine".  None of their operation 
involves the human programmer's expression of knowledge of the ISA: the 
system reals a formal specification of the processor, and the behavior 
of the tools is extracted from the proof.

As it stands in revision 2747 today, the ULD inside Squeak is not 
connected to the formal model (which is just another Smalltalk object).  
To temporarily be able to make progress in combining with Cog/VMMaker, I 
use a mockup object representing a minimal set of facts about the x86 
ISA just enough to allow execution over RSP -- but not any of the 
"really interesting" stuff such as symbolic execution (or even much 
simpler things like assembly/disassembly!)  As I mentioned in my 
previous message, I got impeded by a few barriers to bringing the rest 
of the system over to Squeak; when we overcome those, the simplest next 
low-hanging fruit will be to repeat the same test runs on various ISAs 
without involving any mock-up objects, and exercising all the services 
VMMaker already asks from the Alien plugins (e.g., disassembly).  Only 
slightly higher-hanging is notating the Cog Compiler in the retargetable 
assembler even before we bring in the actual superoptimizer.  Even with 
manual instruction selection, the new assembler is able to propagate 
"holes" to the binary instruction stream.  When this 
instruction-memory-with-holes is executed by the symbolic engine, the 
final state(s) is a term over the unknown variables, and in this way 
proof of program correctness in this case is possible even without 
involving superoptimization.

But let's return to Earth.

> Not sure about the Socket/SUnit
>
Here is my problem with SUnit:  let's say I have some code acting as a 
client on a TCP socket.  It connects to the TCP server, sends a few 
bytes, and receives a response.  When I launch this as a DoIt, it always 
works.  When I put it inside a TestCase, SUnit somehow interferes with 
the socket's receive semaphore, so the semaphore randomly (not always!) 
times out even when data is *already* available.  The exactly same tests 
do not exhibit the problem on either Cuis or Pharo.  Also, when on 
Squeak SUnit is not involved, that socket code never fails even for 
extremely complex exchanges on the network (such as running the whole 
Cog VM over RSP).  But it would be very difficult to continue bringing 
over the other components without working SUnit (esp. considering how 
many tests are already there).