<div dir="ltr"><div>Here is an idea that might increase modularity in the compiler and VM</div>
<div>and might encourage multiple compilers and VMs:</div>
<div> </div>
<div>Instead of having the Compiler dump byte codes directly into a ByteArray</div>
<div>you would have a SmalltalkByteCodeStream which exposes all the different</div>
<div>functionalities of the VM in a systematic way.</div>
<div> </div>
<div> ***********************</div>
<div>Each byte code instruction that the VM recognizes would be expressed</div>
<div>as a SmalltalkByteCodeStream method that encodes that instruction.</div>
<div> </div>
<div> ***********************</div>
<div> </div>
<div>In this way the functionality of the VM becomes self documenting.</div>
<div> </div>
<div>In order to find out the public functionality of the VM you would look</div>
<div>at the public methods in the SmalltalkByteCodeStream. These</div>
<div>methods should be well documented with comments after the body</div>
<div>of the methods.</div>
<div> </div>
<div>The byte code instructions that the VM recognizes become a well</div>
<div>documented public set of instructions that compiler writers can</div>
<div>build their compilers on. These instructions become the VM's</div>
<div>public language.</div>
<div> </div>
<div>Each VM would have its own SmalltalkByteCodeStream so you</div>
<div>might name them SmalltalkByteCodeStreamForVM1 and</div>
<div>SmalltalkByteCodeStreamForVM2 etc.</div>
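<div> </div>
<div>A rough sketch of what such a class might look like, with one method per instruction and the comment after the body. The class, its selectors, the stream instance variable and the category name are all hypothetical, and the opcode numbers 112 and 124 are assumed for VM1 (they match the classic Blue Book style byte code set as far as I recall, but should be checked against the real VM):</div>

```smalltalk
"Hypothetical class, not part of Squeak. It wraps a WriteStream on a ByteArray."
Object subclass: #SmalltalkByteCodeStreamForVM1
    instanceVariableNames: 'stream'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'ByteCodeStream-Sketch'.

SmalltalkByteCodeStreamForVM1>>initialize
    stream := WriteStream on: ( ByteArray new: 16 )
    "Assumes that new sends initialize, as it does in current Squeak."

SmalltalkByteCodeStreamForVM1>>nextPutPushReceiver
    stream nextPut: 112
    "Encodes 'push the receiver (self) onto the stack'.
     112 is the opcode assumed for this instruction in VM1."

SmalltalkByteCodeStreamForVM1>>nextPutReturnTop
    stream nextPut: 124
    "Encodes 'return the top of the stack from the method'.
     124 is the opcode assumed for this instruction in VM1."

SmalltalkByteCodeStreamForVM1>>contents
    ^stream contents
    "Answers the ByteArray of byte codes written so far."

"Workspace usage: encode the body of a method that just returns self."
( SmalltalkByteCodeStreamForVM1 new )
    nextPutPushReceiver ;
    nextPutReturnTop ;
    contents
```

<div>Under those assumptions the workspace expression answers a two byte ByteArray, 112 followed by 124. A compiler writer would only ever see the selectors, never the numbers.</div>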
<div> </div>
<div>For example, if you were encoding an ifTrue:ifFalse: expression</div>
<div>in a naive way, you could do it like this:</div>
<div> </div>
<div>aSmalltalkByteCodeStreamForVM1</div>
<div> nextPutAll: aBooleanExpression ;</div>
<div> nextIfStackTopTrueSkipNextIfStackTopFalse ;</div>
<div> skipNext:( SmalltalkByteCodeStream jumpSize ) ;</div>
<div> yourselfDo:[ :sbcs | jumpToFalseMarker := sbcs makeSpaceForJump ] ;</div>
<div> nextPutAll: ifTrueBranchExpression ;</div>
<div> yourselfDo:[ :sbcs | jumpToExitMarker := sbcs makeSpaceForJump ] ;</div>
<div> yourselfDo:[ :sbcs | jumpToFalseMarker fillInCodeFor: sbcs position ] ;</div>
<div> nextPutAll: ifFalseBranchExpression ;</div>
<div> yourselfDo:[ :sbcs | jumpToExitMarker fillInCodeFor: sbcs position ]</div>
<div> </div>
<div>Object>>yourselfDo: aBlock</div>
<div> aBlock value: self . ^self .</div>
<div> </div>
<div>SmalltalkByteCodeStream>>nextPutAll: aByteCodeGenerator</div>
<div> aByteCodeGenerator generateByteCodesOn: self</div>
<div> </div>
<div>SmalltalkByteCodeStream>>makeSpaceForJump</div>
<div> ^( JumpInstructionMarker new</div>
<div> position: self position</div>
<div> on: self ; yourself</div>
<div> ) yourselfDo:[ :na | </div>
<div> self position:( </div>
<div> ( self position ) + </div>
<div> ( SmalltalkByteCodeStream jumpSize ) ) ]</div>
<div> </div>
<div>JumpInstructionMarker>>fillInCodeFor: expressionPosition</div>
<div> | oldPosition |</div>
<div> oldPosition := stream position .</div>
<div> stream position: jumpInstructionPosition ;</div>
<div> nextPutJumpByteCode ;</div>
<div> intoNext: ( SmalltalkByteCodeStream jumpSize ) -</div>
<div> ( SmalltalkByteCodeStream jumpByteCodeSize ) </div>
<div> putInteger: expressionPosition ;</div>
<div> position: oldPosition</div>
<div> "<---( stream is a WriteStream on a ByteArray or something )"<br></div>
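<div> </div>
<div>The back-patching that makeSpaceForJump and fillInCodeFor: perform can be tried out in a workspace with nothing but a WriteStream on a ByteArray. This is only a sketch: the jump opcode 255 and the two byte jump layout (opcode, then absolute target) are made up for illustration and do not correspond to any real VM.</div>

```smalltalk
"Workspace sketch of back-patching a forward jump.
 Opcode 255 and the two byte jump layout are hypothetical."
| stream jumpPosition oldPosition |
stream := WriteStream on: ( ByteArray new: 16 ).
stream nextPut: 113.                 "an instruction before the jump"
jumpPosition := stream position.     "like makeSpaceForJump: remember where the jump goes"
stream nextPut: 0 ; nextPut: 0.      "reserve two placeholder bytes for the jump"
stream nextPut: 112 ; nextPut: 112.  "the code that the jump will skip over"
oldPosition := stream position.      "the jump target is the current end of the code"
stream position: jumpPosition ;      "like fillInCodeFor: go back to the hole"
    nextPut: 255 ;                   "hypothetical jump opcode"
    nextPut: oldPosition ;           "jump target, small enough to fit in one byte here"
    position: oldPosition.           "restore the write position"
stream contents
```

<div>The ByteArray answered at the end should have the two placeholder bytes replaced by the jump opcode and the target position, with the bytes before and after the hole left unchanged.</div>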
<div class="gmail_quote">On Mon, Sep 1, 2008 at 8:14 PM, David Zmick <span dir="ltr"><<a href="mailto:dz0004455@gmail.com">dz0004455@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">
<div dir="ltr">So, here is an idea, start the VM from scratch, and, redo the entire project to allow what we want in Squeak, and the compiler. I know that is a really crazy idea, but I think it could be possible. I have been thinking about a couple of very unlikely, but, possible, maybe, VM ideas, but, what do you guys think about that?
</div></blockquote>
<div><br> </div>
<div>I think that the new VM2 should be right next to the old VM1</div>
<div>in the running image. So you could use the old VM1 to make the</div>
<div>new VM2.</div>
<div> </div>
<div>Each VM can have multiple name spaces. Each VM generates</div>
<div>a VM space with multiple name spaces inside of it. Each name</div>
<div>space could have its own Object Class and Class hierarchy.</div>
<div>Or you could have a hierarchy being run by VM1 and switch it</div>
<div>over to be run by VM2. And back and forth at runtime.</div>
<div> </div>
<div>Perhaps the CPU could be thought of as a big VM and then the</div>
<div>SmalltalkByteCodeStreamForCPU would generate machine code</div>
<div>into the ByteArray that the SmalltalkByteCodeStreamForCPU</div>
<div>was on. And that ByteArray would be stuck into the</div>
<div>CompiledMethodForCPU.</div>
<div> </div>
<div>It would be cool if each VM was an Object and you could do</div>
<div>things like:</div>
<div> </div>
<div>( VM2 inImage: anImage</div>
<div> inNameSpace: aNameSpace</div>
<div> usingCompiler: aCompiler</div>
<div> eval:'[ someSmalltalkCode ]' )</div>
<div> </div>
<div>In that way the old VM could call up the new VM and have it</div>
<div>evaluate some code. And when that code was done, if ever,</div>
<div>the old VM would continue from there. Or the new Image could</div>
<div>fork into a new thread. etc.</div>
<div> </div>
<div>You would want the debugging to be able to step into this</div>
<div>expression so that you could really see how the VM2 works.</div>
<div> </div>
<div>( VM2 simulation</div>
<div> inImage: anImage</div>
<div> inNameSpace: aNameSpace</div>
<div> usingCompiler: aCompiler</div>
<div> eval: '[ someSmalltalkCode ]' )</div>
<div> </div>
<div>would allow you to see the byte codes being evaluated before</div>
<div>your very eyes. And then the simulation is translated into</div>
<div>C or machine code to make VM2. It would be cool if Squeak</div>
<div>had a portable assembler in it so you didn't have to use C at</div>
<div>all. And that portable Assembler could be </div>
<div>SmalltalkByteCodeStreamForCPUIA32</div>
<div>SmalltalkByteCodeStreamForCPUIA64</div>
<div>etc. Instead of the traditional archaic mnemonics used in</div>
<div>assemblers we could use Smalltalk messages instead to</div>
<div>generate that machine code.<br></div>
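<div>As a sketch of Smalltalk messages standing in for mnemonics, assuming a hypothetical SmalltalkByteCodeStreamForCPUIA32 with made-up selectors. The byte values are the standard IA-32 encodings for MOV EAX, imm32 (B8 followed by four little-endian bytes) and near RET (C3):</div>

```smalltalk
"Hypothetical class, not part of Squeak."
Object subclass: #SmalltalkByteCodeStreamForCPUIA32
    instanceVariableNames: 'stream'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'ByteCodeStream-Sketch'.

SmalltalkByteCodeStreamForCPUIA32>>initialize
    stream := WriteStream on: ( ByteArray new: 16 )

SmalltalkByteCodeStreamForCPUIA32>>nextPutLoadEAXWith: anInteger
    stream nextPut: 16rB8.
    0 to: 24 by: 8 do:[ :shift |
        stream nextPut: ( ( anInteger bitShift: shift negated ) bitAnd: 16rFF ) ]
    "Encodes MOV EAX, imm32. The four bytes of anInteger are written
     lowest byte first. Only meant for integers from 0 up to 16rFFFFFFFF."

SmalltalkByteCodeStreamForCPUIA32>>nextPutReturnToCaller
    stream nextPut: 16rC3
    "Encodes a near RET."

SmalltalkByteCodeStreamForCPUIA32>>contents
    ^stream contents

"Workspace usage: machine code for a routine that returns 42 in EAX."
( SmalltalkByteCodeStreamForCPUIA32 new )
    nextPutLoadEAXWith: 42 ;
    nextPutReturnToCaller ;
    contents
```

<div>The workspace expression answers the six bytes B8 2A 00 00 00 C3. Turning those bytes into something the CPU can actually run (executable memory, calling conventions) is not covered by this sketch.</div>
<div> </div>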
<div>The VM2 simulation expression above would allow you to see an image being</div>
<div>loaded up and a name space within that image being selected</div>
<div>and a Compiler being used to compile '[ someSmalltalkCode ]'</div>
<div>and then being able to see Smalltalk expressions being</div>
<div>evaluated in the debugger in that image and name space</div>
<div>on that VM. And when you hop into a message send then</div>
<div>the byte code debugger would move to the front of the screen</div>
<div>and show the byte codes being executed if desired. It would</div>
<div>be cool if there was a machine code debugger so you could</div>
<div>hop into a byte code instruction and see how it is being</div>
<div>evaluated. It would interpret what was in the registers and</div>
<div>what was on the stack as Objects. There would be inspectors.</div>
<div> </div>
<div>Hopefully this kind of thing would allow multiple VMs and</div>
<div>multiple images and multiple compilers all to be running at</div>
<div>the same time. Hopefully it would encourage VM development</div>
<div>and compiler development such that Squeak could branch</div>
<div>out in all different ways.</div>
<div> </div>
<div>You could have SmalltalkByteCodeStreamForV8 which</div>
<div>would make public the functionality of the V8 JavaScript VM.</div>
<div>And then you could have the V8 VM be one of the VMs</div>
<div>inside of Squeak.</div>
<div> </div>
<div>You can switch from VM to VM at runtime.</div>
<div> </div>
<div>You can use the old VMs to make a new one.</div>
<div> </div>
<div>There are Smalltalk debuggers and byte code debuggers and</div>
<div>machine code debuggers.</div>
<div> </div>
<div>There is the traditional Squeak VM and there are platform</div>
<div>specific VMs that can all run side by side. There are</div>
<div>multiple different Windowing systems all running side by</div>
<div>side. Some native and some not. Some the old Squeak</div>
<div>way and some the Dolphin way and some the Java way, etc.</div>
<div> </div>
<div>Squeak's portable assembler </div>
<div>SmalltalkByteCodeStreamForCPU can be used to</div>
<div>output an executable file that has zero or more VMs inside</div>
<div>of it into a Directory on disk with zero or more image files</div>
<div>and source code files for the different name spaces and</div>
<div>hierarchies. Then you fire up that executable and those</div>
<div>VMs are inside of it.</div>
<div> </div>
<div>It would be cool if there was a PEFileStream that could</div>
<div>be used to make public all the sections inside of a</div>
<div>PE format executable file. With a sequence of tests</div>
<div>going from simple to complex and lots of documentation.</div>
<div> </div>
<div>I do think that there should be a new VM and it should</div>
<div>run right alongside of the old one and be the first of many.</div>
<div> </div>
<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">
<div dir="ltr">
<div>
<div class="Wj3C7c">
<div class="gmail_quote">On Mon, Sep 1, 2008 at 7:56 PM, Igor Stasenko <span dir="ltr"><<a href="mailto:siguctua@gmail.com" target="_blank">siguctua@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0pt 0pt 0pt 0.8ex; BORDER-LEFT: rgb(204,204,204) 1px solid">2008/9/2 Kjell Godo <<a href="mailto:squeaklist@gmail.com" target="_blank">squeaklist@gmail.com</a>>:<br>
<div>> Where is this new compiler project? Where is NewCompiler? I would like to<br>> see it.<br>> Does anybody know where that book about the Squeak Compiler went to?<br>><br>> the rest down below is all nonsense and I wouldn't read it if I were you.<br>
><br>> i knew this was going to cost me.<br>><br>> What is atomic loading? Does it mean no dependencies or dependencies are<br>> handled?<br>> It seems to me that there needs to be some kind of intellegent dependencies<br>
> manager that works a lot better and is a lot smarter than what has been put<br>> out there so far.<br>><br><br></div>The atomic loading is not about handling dependencies (they are<br>present and adressed as well, of course), but about installing a<br>
number of changes in system replacing old behavior in a single short<br>operation, which can guarantee a safety from old/new behaviour<br>conflicts.<br>
<div><br>> How can I learn about how a good Squeak compiler works? Without years and<br>> millions of dead hours?<br>><br><br></div>Sure, you need some experience in compiling things, especially<br>smalltalk. Or , at least , if you even don't have such experience, but<br>
using Parser/Compiler for own purposes, your experience is valuable as<br>well, since you can highlight different problems or propose better<br>interface or new functionality.<br>
<div><br>> Modularity is very good. I think that all of Squeak should be very self<br>> explaining. This can be done if you put your explanations of what is going<br>> on after the body of the method. Colored source is good too. See Dolphin.<br>
> But without reformating.<br>><br>> I am making picoLARC on <a href="http://sourceforge.net/" target="_blank">sourceforge.net</a>. Each lisp/smalltalk expression<br>> gets compiled by an instance of a Compiler Class. Each expression( let if<br>
> define etc ) has its own KEGLambdaLispCompiler subClass with one<br>> standard method and zero or more helper methods in it. Each Compiler<br>> outputs an instance of a subClass of the Eval Class. An Eval can be<br>
> evaluated at runtime by >>evalWithActivationRec: or it could generate byte<br>> codes which do the same thing via some method like<br>> EvalSubClass>>generateByteCodesOn:usingCodeWalker: where the CodeWalker<br>
> could tie Evals together or do optimizations? Is this not a good design? I<br>> know I like the part about one Compiler Class for each expression and one<br>> corresponding Eval Class. But I haven't done any byte code generation yet<br>
> so I don't know about that part. One Compiler per Eval is not strict. The<br>> ApplicationCompiler can output several different related kinds of Evals for<br>> the different function calls and message sends.<br>
><br>> What is this visitor pattern?<br><br></div><a href="http://en.wikipedia.org/wiki/Visitor_pattern" target="_blank">http://en.wikipedia.org/wiki/Visitor_pattern</a><br>
<div><br>> I don't like the idea of putting byte code<br>> generation into a single Class. But I feel like maybe I don't know what I'm<br>> talking about. To modify the byte code generation for an expression you<br>
> would subClass the Eval Class and modify the<br>>>>generateByteCodeOn: aCodeStream. The initial implementor would try to<br>>>> seperate out the parts that might be modified by someone into seperate<br>
>>> methods that get called by<br>>>>generateByteCodeOn: so these helper methods would generally be overridden<br>>>> and not<br>>>>generateByteCodeOn: unless that method was really simple. So the initial<br>
>>> implementor has to think about reuse and the places where modification might<br>>>> occure. So you would have a lot of simple<br>>>>generateByteCodeOn: methods instead of one big complex one.<br>
><br>> There are all different ways of calling a function or method or query etc in<br>> picoLARC and these are all subClasses of KEGLambdaLispApplicationEvalV6p1<br>> and it seems to work fine.<br>><br>> But overriding >>generateByteCodesOn: is not good enough is it? The<br>
> Compiler Classes can't have hard coded Eval instance creations either<br>> right? The Compiler Class has to be subClassed also and the<br><br></div>The problem in Squeak compiler that you will need to override much<br>
more classes than just Compiler to emit different bytecode, for<br>instance.<br>
<div><br>>>>meaningOf:inEnviron: needs to have a<br>> ( self createEval ) expression in it that could be subClassed and<br>> overridden. And then that subClass has to be easily inserted into the<br>> expression dispatch table that pairs up expressions with expression<br>
> Compilers. So when that table gets made there should be a<br>> ( tableModifier modify: table ) which could stick the < expression Compiler<br>>> pairs in that are needed.<br>><br>> I think that is all that would be required to modify the compilation of an<br>
> expression.<br>> I will have to make these changes to picoLARC so it will be more modifiable.<br>><br>> I think the Compiler should be very modular. For picoLARC one Class per<br>> expression and one Class per Eval seems to work good. Stuffing lots of<br>
> seperate things into a single Class and doing a procedural functional thing<br>> and not an OOP thing does not seem good to me.<br>><br>> I think that the Compiler should be very clean and a best practices example<br>
> with a long comment at the bottom of each method telling all about what it<br>> does. Writing it out and referencing other related methods helps to think<br>> about what is really going on and then a better design without hacks comes<br>
> out. I don't think hacking should be encouraged at all. Hacking just makes<br>> a mess.<br>><br></div>+1<br><br>The design should allow replacing critical parts of compiler by<br>subclassing without the need in modifying original classes.<br>
A so-called extensions , or monkey patching is very bad practice which<br>in straightly opposite direction from modularity.<br><br>I thinking, maybe at some point, to prevent monkey patching, a<br>deployed classes can disallow installing or modifying their methods.<br>
<div><br><br>> And then this practice of not making any Package comments has got to stop.<br>> I think that people who do that should be admonished in some way. I think<br>> that the Package comment for the Compiler should contain the design document<br>
> for it that tells all about how it is designed. If it needs to be long then<br>> it should be long. It should include: How to understand the Compiler.<br>> There should be a sequence of test cases that start simple and show how it<br>
> all works.<br>><br>> And that should go for the VM too. This idea that the VM can be opaque and<br>> only recognizable to a few is not good.<br>><br><br></div>VM tend to be complex. And complexity comes from inability of our<br>
hardware/OS work in a ways how we need/want it.<br>
<div><br>> These should be works of art and not hacked up piles of rubbish to be hidden<br>> away into obscurity.<br>><br>> There is this idea that one should only care about what something does. And<br>> the insides of it are a random black box that you tweek and pray on. But I<br>
> think that the insides should be shown to the world. They should<br>> be displayed on a backdrop of velvet. Especially the Compiler and VM and VM<br>> maker. And then the whole Windowing thing should be modularized so you can<br>
> have multiple different Windowing systems.<br>><br>> And what about having multiple VMs? It would be cool if picoLARC could be<br>> inside of Squeak in that way. It would be cool if one VM was generalized so<br>
> that it could support different dialects and languages. And another was<br>> specific and fast. And you could make various kinds of VMs and images and<br>> output them onto disk without a lot of trouble. It would come with gcc and<br>
> all that junk all set up so it would just work. If you already had gcc you<br>> could tell it not to download it.<br>><br><br></div>What is gcc? And why it required to make VM? ;)<br>
<div><br>> picoLARC has simple name spaces called Nodules where you can have Nodules<br>> inside of Nodules and Nodules can multiply inherit variables from others.<br>> Maybe such a thing could be used in Squeak? Then you could have multiple<br>
> VMs. And VMs inside of VMs.<br>><br>> I think that Dolphin Smalltalk could be held up as an example of pretty.<br>><br><br></div>Maybe, if you know how to deal with license & copyrights when taking<br>their source and blindly putting it to Squeak :)<br>
<div><br>> I hope picoLARC will be another one.<br>><br>> I think that Squeak is pretty in a somewhat cancerous sort of way.<br>> The cancer is all the hacking. That goes on.<br>> The vision is great but the hacking and undocumenting gum up all those big<br>
> ideas.<br>><br>> Sure it's quick but it rots away quickly too.<br>><br>> Undocumented features. In Smalltalk this is less of a problem but in like<br>> Lisp say you make this great feature but then don't document it. You might<br>
> as well have not even made it.<br>><br><br><br><br></div>--<br>
<div>
<div></div>
<div>Best regards,<br>Igor Stasenko AKA sig.<br><br></div></div></blockquote></div><br><br clear="all"><br></div></div>
<div class="Ih2E3d">-- <br>David Zmick<br>/dz0004455\<br><a href="http://dz0004455.googlepages.com/" target="_blank">http://dz0004455.googlepages.com</a><br><a href="http://dz0004455.blogspot.com/" target="_blank">http://dz0004455.blogspot.com</a><br>
</div></div><br><br><br></blockquote></div><br></div>