[squeak-dev] [ANN] CorruptVM preview

Tue Jul 1 19:44:42 UTC 2008

Hi Igor,
    this looks cool.  It is related to David Ungar's Klein which was an
attempt at a self-hosted Self system, and to Exupery and Typed Smalltalk,
and Ian's Cola.  Whenever I've thought about this style VM I've always been
put off by a bug issue.  How are you going to deal with hard crashes?

One needs some form of symbolic debugging at the machine code level.  If one
is debugging the Squeak VM (or any other Smalltalk VM I've worked on) one
can compile a version with debug symbols, use the platform's debugger (e.g.
gdb) and write debugging functions in C to be called from that debugger.
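
As a rough sketch of the kind of debugger-callable C function meant here
(the names dbg_describe_oop and dbg_print_oop, the object header layout and
the low-bit integer tagging are all hypothetical, made up for illustration,
and not taken from the Squeak VM):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical object layout, for illustration only. */
typedef uintptr_t oop_t;

typedef struct {
    const char *class_name;  /* name of the object's class */
    size_t      slot_count;  /* number of pointer slots in the object */
} object_header_t;

/* Assumed tagging scheme: low bit set means an immediate small integer. */
static int is_small_int(oop_t oop) { return (oop & 1) != 0; }

/* Format a one-line description of oop into buf and return buf. */
const char *dbg_describe_oop(oop_t oop, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    if (is_small_int(oop)) {
        /* Drop the tag bit (assumes an arithmetic right shift). */
        snprintf(buf, size, "SmallInteger %ld", (long)((intptr_t)oop >> 1));
    } else if (oop == 0) {
        snprintf(buf, size, "null oop");
    } else {
        const object_header_t *h = (const object_header_t *)oop;
        snprintf(buf, size, "%s with %lu slots",
                 h->class_name, (unsigned long)h->slot_count);
    }
    return buf;
}

/* Entry point meant to be invoked by hand from the debugger. */
void dbg_print_oop(oop_t oop)
{
    char buf[128];
    fprintf(stderr, "%s\n", dbg_describe_oop(oop, buf, sizeof buf));
}
```

With the VM built with debug symbols and stopped at a breakpoint, one would
then type something like "call dbg_print_oop(someOop)" at the gdb prompt.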

If one has a self-hosted Smalltalk system with no symbolic information that
can be read by a platform's debugger because the system, being Smalltalk,
has its own fully reflective self-description, then it seems to me one really
is fishing about in a vast hex dump of the entire system, and that doesn't
seem workable.  Note that in the presence of a hard crash one doesn't have
the system to debug itself because it has just crashed.

So are you going to export symbolic information that a platform debugger can
consume (and if so, how?) or are you going to do something else (e.g.
mirrors)?

On Mon, Jun 30, 2008 at 11:06 PM, Igor Stasenko <siguctua at gmail.com> wrote:

> Since lately there has been some interest in C/C++ compiling and some people
> mentioned how cool it would be to make everything dynamically
> compiled,
> I decided to make a preview announcement of research which I have done during
> the last few months.
>
> The project name is weird, and it definitely requires a more appropriate
> name so as not to scare off potential users/developers, but this is not an
> issue right now :)
>
> Let me briefly describe the key features of the project and the goals it
> is pursuing:
>
> - the main goal is to create a Smalltalk language environment (similar
> to other Smalltalks), but I avoid calling it a VM, because it's not
> really a VM: there is no VM at all.
> - everything is written in Smalltalk
> - the system is completely self-sustaining: Smalltalk code is compiled down
> to native code (no initial need for bytecodes). Of course no one
> prevents you from implementing a bytecode interpreter on top of it.
> But this is beyond the scope of the current project. :)
> - there are no primitives, nor any need to write external code in C (or in
> any other statically typed language). Primitives are replaced by native
> methods (methods with a <native> pragma), with which you can
> implement any low-level behavior.
>
> - everything (well, 99.9% ;) in the system is up to the implementor. There are
> a few 'glue' semantics used by the compiler, but the compiler itself
> makes extensive use of static inlining (inlining native methods from
> well-known classes such as CompiledMethod/ProtoObject or
> StackContext). Memory management/relocation, FFI, and the diverse set of
> what we currently know as 'primitives' will be implemented in the
> system. This opens a potentially huge playground for what the system could
> look like :)
>
> - avoid using global state. All state which code can potentially refer
> to is placed in literals. There is no difference between native
> methods and Smalltalk methods in the compiled method format. The
> only difference is how they are compiled. Of course there will be some
> global state; I think it would be a single 'lobby' object, which
> contains a symbol table (required to support symbol uniqueness
> throughout the whole system). But anyway, references to it will be possible
> only from method literals.
> - generated native code is location independent, since all jumps will
> be relative, and all location-dependent stuff is either held in
> literals or computed. Therefore CompiledMethod instances can be
> relocated freely in memory (by GC and friends) without any chance that
> it will cause any harm.
>
> - the compiler translates Smalltalk code to a lambda representation. Then,
> using different transformations, it generates low-level lambdas,
> which represent virtual machine CPU instructions. No AST nor a bunch
> of different classes is used to represent semantic elements of code.
> Lambdas all the way down.
>
> - the object memory model is initially based on Ian's minimal object
> system, with some changes.
>
> You can download a snapshot of the project at squeaksource:
> http://www.squeaksource.com/CorruptVM
>
> What should currently work:
>
> CVMachineSimulator bootstrap   -- bootstrap an object memory for simulation
> CVSimulationTests run -- run different tests on the bootstrapped object memory
>
> There is also an initial implementation of translating to native code
> using Exupery (you need to load Exupery for that).
> Do it:
> CVExuperyCompiler test inspect
>
>
> I am currently open to suggestions, advice, or discussion on how
> best to implement a system based on such a design.
> I would be glad to read your comments.
>
> There is also a wiki page for the project:
> http://wiki.squeak.org/squeak/6041
>
> --
> Best regards,
> Igor Stasenko AKA sig.
>
>