I am developing a replacement for SmaCC (far from complete).<br>Once it is completed I am interested in being able to define languages<br>and have code in these languages placed directly in a <br>Smalltalk (probably Squeak) image.
<br>I haven't figured out how this will work. <br>Somewhere there will need to be a specification<br>that a given class or method contains not Smalltalk code <br>but code from some language X.<br>The compiler for language X would generate Smalltalk byte codes
<br>that would be stored in the image much like the byte codes of Smalltalk methods.<br>Some example languages:<br> 1) Regular expressions<br> 2) bash shell scripts (Linux)<br> 3) sed scripts (Linux)<br> 4) C<br><br>
There are tons of issues here to be resolved much later.<br><br>Meanwhile, I thought of two more languages worth adding to the list: <br><br> 5) SLang<br> 6) Assembler for the virtual machine.<br><br>The reason for 5) is to allow users to write code that they mean
<br>to be efficient. My understanding is that SLang is C-like and gives<br>similar performance. Of course translating SLang methods to byte<br>codes will slow them down, but it should still be useful when performance matters.
<br>The reason for 6) is the same, though I admit it would be odd for<br>someone so committed to performance that he writes his code in<br>assembler to then choose to target a Smalltalk virtual machine.<br><br>
Questions:<br><br> a) Is it a good idea to be able to embed other languages/language compilers<br>into a Smalltalk image? I am sure many will object to this idea, <br>but I hope there are some supporters as well, at least in principle.
<br>At this point I am just interested in hearing what the pros and cons are.<br><br> b) I am just writing an easier to use, more powerful, and faster SmaCC.<br>To generate byte codes from a language specification <br>
more compiler utilities are needed. <br>Has anyone done any work in this area, <br>either in Squeak or other versions of Smalltalk?<br><br> c) If efficiency is desired when converting a regular expression<br>
to a finite state machine, then one option is to represent <br>
the finite state machine in machine code or a language that<br>supports computed gotos and collections of goto locations. <br>A transition in the finite state machine might be represented by<br>
an instruction of the form:<br><br>goto followStatesOfThisState at: inputValue ifAbsent: [self reportInvalidInputAndExit].<br><br> If I write a compiler to generate Squeak virtual machine byte codes <br>corresponding to an input regular expression,
<br>will I be able to generate instructions<br>equivalent to the goto instruction above?<br><br><br>Of course there are many problems. The biggest concern I have is that<br>we can't allow the user to generate virtual machine codes that blow the
<br>system out of the water. Admittedly this problem could kill the whole<br>idea. Perhaps there are security issues as well.<br><br><br>Once the above has been completed I hope to start another project.<br>I want to build a replacement for the Linux shell bash which I call squash.
<br><br>Squash would be a running Squeak image which provides the user with an<br>interface similar to that of the bash shell in Linux.<br>It would also be possible to write squash shell scripts that provide functionality
<br>similar to bash shell scripts. Unlike bash shell scripts, however, squash shell<br>scripts would be compiled into files of byte codes called squash executable files.<br>It would also be possible to have other languages whose compilers generate
<br>squash executable files and have the squash shell execute them.<br><br>What does all of this give us (at least on Linux)?<br>Well, I think it gives us:<br><br> 1) A version of Squeak that interfaces with other languages much better
<br>than now, thus making Squeak (Smalltalk) much more useful to the<br>software community in general.<br> 2) The option of using Squeak where scripting languages are now used. <br>And since in the Squeak case the code is compiled into byte codes it
<br>should run faster than would be the case for many scripting languages.<br> 3) Since the squash shell scripting language <br>will have the entire Squeak image available to it, <br>it should be a powerful language to write shell scripts in.
<br><br>Comments welcome, especially negative (but constructive) ones.<br>I am looking for comments on whether this is a good or bad idea,<br>not a detailed description of what all the problems are, <br>though the latter may be useful too.
<br><br>Ralph Boland<br>
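P.S. To make the goto instruction in question c) a little more concrete, here is a rough sketch in Python (not Squeak byte codes) of the table-driven fallback that needs no computed goto at all. Everything in it is made up for illustration: the regular expression `ab*c`, the state numbers, and the names `FOLLOW_STATES` and `matches`. The dictionary lookup with a default plays the role of `at: inputValue ifAbsent: [...]`; whether the Squeak VM offers anything closer to a real computed goto is the open question.

```python
# Hypothetical DFA for the regular expression ab*c (illustration only).
# FOLLOW_STATES[state][symbol] gives the next state, mirroring
# "followStatesOfThisState at: inputValue" in the pseudo-instruction.
FOLLOW_STATES = {
    0: {"a": 1},          # start state: only 'a' is valid
    1: {"b": 1, "c": 2},  # after 'a': loop on 'b', finish on 'c'
    2: {},                # accepting state: no further input allowed
}
START_STATE = 0
ACCEPTING_STATES = {2}


def matches(text):
    """Return True if the whole of text is accepted by the DFA."""
    state = START_STATE
    for symbol in text:
        # The lookup with a default stands in for the ifAbsent: block.
        next_state = FOLLOW_STATES[state].get(symbol)
        if next_state is None:
            return False  # invalid input: report and exit
        state = next_state
    return state in ACCEPTING_STATES
```

With this table, `matches("abbc")` is true and `matches("abx")` is false. A compiler that could emit a real computed goto would replace the per-symbol table lookup with a direct jump to the code for the next state.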