Hi Igor,<div><br></div><div>&nbsp;&nbsp; &nbsp;this looks cool. &nbsp;It is related to David Ungar&#39;s Klein, which was an attempt at a self-hosted Self system, and to Exupery, Typed Smalltalk, and Ian&#39;s Cola. &nbsp;Whenever I&#39;ve thought about this style of VM I&#39;ve always been put off by one big issue. &nbsp;How are you going to deal with hard crashes?</div>

<div><br></div><div>One needs some form of symbolic debugging at the machine code level. &nbsp;If one is debugging the Squeak VM (or any other Smalltalk VM I&#39;ve worked on) one can compile a version with debug symbols, use the platform&#39;s debugger (e.g. gdb) and write debugging functions in C to be called from that debugger.</div>
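<div><br></div><div>For illustration, a debugger-callable helper of that kind might look like the sketch below. &nbsp;The names oop, describe_oop and print_oop, and the low-bit SmallInteger tagging, are assumptions made up for this example; they are not the actual Squeak VM API.</div>

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical object pointer type: a machine word. */
typedef uintptr_t oop;

/* Assumed tagging scheme for this sketch: low bit set means SmallInteger. */
static int is_small_int(oop o) { return (o & 1) != 0; }

/* Drop the tag bit. Right-shifting a negative value is implementation-defined
   in C; common compilers perform an arithmetic shift. */
static intptr_t small_int_value(oop o) { return (intptr_t)o >> 1; }

/* Format a short description of o into buf; returns what snprintf returns. */
int describe_oop(oop o, char *buf, size_t n) {
    if (is_small_int(o))
        return snprintf(buf, n, "SmallInteger %ld", (long)small_int_value(o));
    return snprintf(buf, n, "object at %p", (void *)o);
}

/* Entry point meant to be invoked by hand from the debugger. */
void print_oop(oop o) {
    char buf[64];
    describe_oop(o, buf, sizeof buf);
    fprintf(stderr, "%s\n", buf);
}
```

<div>With the VM built with debug symbols and stopped in gdb, one could then type something like &quot;call print_oop(x)&quot;, where x is whatever variable or register holds the value of interest.</div>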

<div><br></div><div>If one has a self-hosted Smalltalk system with no symbolic information that can be read by a platform&#39;s debugger because the system, being Smalltalk, has its own fully reflective self-description, then it seems to me one really is fishing about in a vast hex dump of the entire system, and that doesn&#39;t seem workable. &nbsp;Note that in the presence of a hard crash one doesn&#39;t have the system to debug itself because it has just crashed.</div>

<div><br></div><div>So are you going to export symbolic information that a platform debugger can consume (and if so, how?) or are you going to do something else (e.g. mirrors)?<br><br><div class="gmail_quote">On Mon, Jun 30, 2008 at 11:06 PM, Igor Stasenko &lt;<a href="mailto:siguctua@gmail.com">siguctua@gmail.com</a>&gt; wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">Since there has lately been some interest in C/C++ compiling, and some people<br>

mentioned how cool it would be to have everything dynamically<br>

compiled,<br>

I decided to make a preview announcement of some research I have done during<br>

the last few months.<br>

<br>

The project name is weird, and it definitely requires a more appropriate<br>

name so as not to scare off potential users/developers, but this is not an<br>

issue right now :)<br>

<br>

Let me briefly describe the key features of the project and the goals it is pursuing:<br>

<br>

- the main goal is to create a Smalltalk language environment (similar<br>

to other Smalltalks), but I avoid calling it a VM, because it is not<br>

really a VM: there is no VM at all.<br>

- everything is written in smalltalk<br>

- the system is completely self-sustaining: Smalltalk code is compiled down<br>

to native code (no initial need for bytecodes). Of course, no one<br>

prevents you from implementing a bytecode interpreter on top of it.<br>

But this is beyond the scope of the current project. :)<br>

- there are no primitives, nor any need to write external code in C (or in<br>

any other statically typed language). Primitives are replaced by native<br>

methods (methods with the &lt;native&gt; pragma), with which you can<br>

implement any low-level behavior.<br>

<br>

- everything (well, 99.9% ;) &nbsp;in the system is up to the implementor. There are<br>

a few &#39;glue&#39; semantics used by the compiler, but the compiler itself<br>

makes extensive use of static inlining (inlining native methods from<br>

well-known classes such as CompiledMethod/ProtoObject or<br>

StackContext). Memory management/relocation, FFI, and a diverse set of<br>

what we currently know as &#39;primitives&#39; will be implemented in the<br>

system. This opens a potentially huge playground for what the system could<br>

look like :)<br>

<br>

- avoid using global state. All state that code can potentially refer<br>

to is placed in literals. There is no difference between native<br>

methods and Smalltalk methods in the compiled method format. The<br>

only difference is how they are compiled. Of course there will be some<br>

global state; I think it would be a single &#39;lobby&#39; object, which<br>

contains a symbol table (required to support symbol uniqueness<br>

throughout the whole system). But anyway, references to it will be possible<br>

only from method literals.<br>

- generated native code is location independent, since all jumps will<br>

be relative and all location-dependent stuff is either held in<br>

literals or computed. Therefore CompiledMethod instances can be<br>

relocated freely in memory (by the GC and friends) without any chance that<br>

it will cause any harm.<br>

<br>

- the compiler translates Smalltalk code to a lambda representation. Then,<br>

using different transformations, it generates low-level lambdas,<br>

which represent virtual machine CPU instructions. No AST nor a bunch<br>

of different classes is used to represent the semantic elements of code.<br>

Lambdas all the way down.<br>

<br>

- the object memory model is initially based on Ian&#39;s minimal object<br>

system, with some changes.<br>

<br>

You can download a snapshot of the project at SqueakSource:<br>

<a href="http://www.squeaksource.com/CorruptVM" target="_blank">http://www.squeaksource.com/CorruptVM</a><br>

<br>

What should currently work:<br>

<br>

CVMachineSimulator bootstrap &nbsp; -- bootstrap an object memory for simulation<br>

CVSimulationTests run -- run different tests on the bootstrapped object memory<br>

<br>

There is also an initial implementation of translation to native code<br>

using Exupery (you need to load Exupery for that).<br>

Evaluate:<br>

CVExuperyCompiler test inspect<br>

<br>

<br>

I am currently open to suggestions, advice, or discussion on how<br>

best to implement a system based on such a design.<br>

I would be glad to read your comments.<br>

<br>

There is also a wiki page for the project:<br>

<a href="http://wiki.squeak.org/squeak/6041" target="_blank">http://wiki.squeak.org/squeak/6041</a><br>

<font color="#888888"><br>

--<br>

Best regards,<br>

Igor Stasenko AKA sig.<br>

<br>

</font></blockquote></div><br></div>