[Vm-dev] Bochs simulator plug-in structure

Lars lars.wassermann at googlemail.com
Sat May 12 09:09:38 UTC 2012

Hi all,

I am Lars, one of this year's GSoC students, and my project is 
implementing an ARM Simulator plugin and subsequently an ARM JIT for the 
Cog VM[1].

Last week, I had a meeting with my mentors, Eliot and Stefan, and I will 
try to summarize my understanding of the Cog VM structure here. The main 
topics of our chat were the overall structure of the project, a 
timeline, and the functions the Bochs simulator provides for the CogVM 
x86 JIT. This summary is based on our meeting notes and my own 
understanding, and might not be correct.

The Bochs x86 simulator[2] is a CogVM plugin written in C to interface 
with the Bochs x86-64 C++ simulator[3]. The Bochs simulator can emulate 
an Intel x86 CPU and, among other things, common I/O devices, a BIOS, 
and different instruction extensions like MMX. Of all that, only 
disassembly, single and multiple instruction execution, and error 
decoding are used.
The Bochs plugin keeps track of the ByteArray into which code has been 
compiled. It has to do so because the Smalltalk VM garbage collector 
may move the ByteArray around in memory. The plugin implements several 
(5?) primitives, which are:

- single instruction execution on given memory in the form of a 
  ByteArray;
- disassembly using the Bochs disassembler, in order to compare the 
  intended meaning of some bytes from a ByteArray with their 
  interpretation by an emulated CPU;
- execution of code from the ByteArray until an interrupt or error. 
  Note that the CogVM uses a heartbeat (~once a second?) (for GC?) 
  which is an interrupt, and thus the simulator will not run longer 
  than the interval between two heartbeats;
- accessing the error codes which are provided by Bochs;
- accessing the register state of the simulated CPU?
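
To make the difference between single-step execution and running until 
an interrupt more concrete, here is a minimal toy sketch. It is not the 
Bochs API or the plugin's actual interface: the class ToyCPU, its two 
opcodes, and the step budget that stands in for the heartbeat are all 
assumptions made up for illustration.

```python
# Toy model only: none of these names come from Bochs or the Cog plugin.
class ToyCPU:
    def __init__(self, memory: bytearray):
        self.memory = memory  # stands in for the ByteArray holding compiled code
        self.pc = 0           # index of the next instruction in memory
        self.acc = 0          # a single accumulator register

    def single_step(self):
        """Execute exactly one instruction at pc."""
        opcode = self.memory[self.pc]
        if opcode == 0x01:    # hypothetical INC
            self.acc += 1
        elif opcode == 0x00:  # hypothetical NOP
            pass
        else:
            raise ValueError(f"illegal opcode {opcode:#x} at {self.pc}")
        self.pc += 1

    def run(self, steps_until_heartbeat):
        """Execute until the step budget (the 'heartbeat') is used up,
        the end of memory is reached, or an error is raised.
        Returns the number of instructions executed."""
        steps = 0
        while self.pc < len(self.memory) and steps < steps_until_heartbeat:
            self.single_step()
            steps += 1
        return steps
```

A caller would alternate between run() and whatever work the heartbeat 
is meant to trigger, which is why a single run() never executes longer 
than one heartbeat interval in this model.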

The ByteArray provided to the simulator contains an image: objects, a 
stack, and instructions. Whenever the simulated code reaches a point 
where Smalltalk code would have to be interpreted, the instructions 
have been generated in such a way that they access an illegal memory 
address. This stops the simulation and returns control to the 
Smalltalk VM, which interprets the Smalltalk code and afterwards 
continues the simulation.
So the plugin also maintains these false addresses.
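
The trap mechanism described above can be sketched roughly as follows. 
This is only my illustration of the idea, not code from the plugin: 
IllegalAccess, TrapTable, simulate_call, run_with_traps and the memory 
size are hypothetical names and values.

```python
# Illustration only: all names and the memory size are hypothetical.
MEMORY_SIZE = 0x1000  # addresses at or above this are illegal in the toy


class IllegalAccess(Exception):
    """Raised by the toy simulator when code touches an illegal address."""

    def __init__(self, address):
        super().__init__(f"illegal access at {address:#x}")
        self.address = address


def simulate_call(target):
    """Stand-in for the simulator executing a call to `target`."""
    if target >= MEMORY_SIZE:
        raise IllegalAccess(target)
    return f"jumped to {target:#x}"


class TrapTable:
    """Hands out illegal ('false') addresses and remembers their handlers."""

    def __init__(self):
        self._next = MEMORY_SIZE
        self._handlers = {}

    def register(self, handler):
        address = self._next
        self._next += 1
        self._handlers[address] = handler
        return address

    def dispatch(self, address):
        return self._handlers[address]()


def run_with_traps(target, traps):
    """Run simulated code; on a trap, let the host side handle it."""
    try:
        return simulate_call(target)
    except IllegalAccess as trap:
        return traps.dispatch(trap.address)
```

The generated code would then call the address returned by register() 
whenever it needs the VM, and the host looks the address up to decide 
what to do before resuming the simulation.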

The simulator allows stepwise debugging and automated tests before 
dealing with a real CPU, where debugging would be much more tedious.

There has not been a definitive decision on which ARM simulator to use, 
but according to a previous attempt (reported in a mail sent directly 
to Eliot), most open source ARM simulators are not worth using. Among 
others, QEMU is not viable[4]. So far, a good choice seems to be the 
GNU ARMulator[5]. Another one mentioned in our chat was SkyEye[6].

In order to use/test the simulator, an ARM compiler is needed. The CogVM 
includes an abstract instruction set as one of two basic intermediate 
languages. These instructions are for an abstract 2-register machine 
(because x86 is a 2-register machine) and are implemented in subclasses 
of CogAbstractInstruction for the actual instruction architectures. 
Instances of CogAbstractInstruction are single instructions for that 
abstract machine. The x86 subclass is CogIA32Instructions. ARM, by 
contrast, is a 3-register machine.

Those abstract instruction objects are created by an instance of the 
Cogit and should produce the assembled machine code, which then 
populates a ByteArray.
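
As a rough sketch of what mapping a 2-register abstract instruction to 
a 3-register target could look like: the names AbstractAdd, to_ia32 
and to_arm and the register assignments below are my own assumptions, 
not Cogit code, and the output is assembly text rather than encoded 
bytes to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbstractAdd:
    """dst := dst + src on the abstract 2-register machine (hypothetical)."""
    src: str
    dst: str


# Hypothetical mapping from abstract register names to concrete ones.
IA32_REGS = {"A": "eax", "B": "ebx"}
ARM_REGS = {"A": "r0", "B": "r1"}


def to_ia32(insn):
    # x86 add is two-operand: the destination is also the first source.
    return f"add {IA32_REGS[insn.dst]}, {IA32_REGS[insn.src]}"


def to_arm(insn):
    # ARM add is three-operand, so the destination is repeated as operand 1.
    dst = ARM_REGS[insn.dst]
    return f"add {dst}, {dst}, {ARM_REGS[insn.src]}"
```

For example, AbstractAdd(src="B", dst="A") becomes "add eax, ebx" for 
x86 and "add r0, r0, r1" for ARM. The third register goes unused in 
this direct translation.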

This was an abstract overview of two key parts which are needed for 
JITting. We meet again next week and Eliot plans to show me around the 
code that implements the functionality described above.

At the moment, my semester is still running, so there might not be much 
progress. But at the beginning of June, most of the lectures will be 
done.

All the best,

[1] http://gsoc2012.esug.org/projects/arm-jitter
[3] http://bochs.sourceforge.net/
[5] http://en.wikipedia.org/wiki/ARMulator
[6] http://sourceforge.net/projects/skyeye/?source=directory
