Syntax & Semantics [was: Re: [Enough already] Re: Proposal3: Make $_ ..]

Mats Nygren nygren at sics.se
Mon Jun 5 17:17:09 UTC 2000


Stefan Matthias Aust <sma at 3plus4.de> wrote:
> Mats,
> 
> I wonder whether you might have a tool ready that would allow me to read in 
> a bunch of Python files and convert them to Squeak.

The short answer is no, unfortunately, but see below.

> I know, that this is 
> probably not really useful as the power of Python lies in its libraries but 
> still, it would be nice if I could get some of the millions of available Python 
> solutions to work with in Squeak.

I agree on this. The syntactic conversion is at least a start.

> If not, I might try to write a scanner/parser on my own, but I got the 
> impression you're working on a similar idea already.

I was, until I got busy studying Squeak. Now is the time to get it
together.

The following is the story.

I have made a tool for describing syntaxes. It is written in C and
currently runs only on Linux. It should be easy to port, since it is
really a filter. (Well, and if linked differently, a compiler, bytecode
interpreter, evaluator, editor; you know the kind.)

Be careful to understand that this is all about syntax. I made the
judgement that syntax is always an essential part in doing a certain kind of
work. So I decided to do some useful work on that.

Approximate definition:
  syntax <-> a (linear time) bijective mapping between a set of trees
and a set of sequences (of non-trees).
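
To make the definition concrete, here is a minimal sketch of one such
mapping, written in Python. It is not the tool described in this mail. The
tree representation (nested tuples with a string label), the fully
parenthesised token form and the names unparse/parse are all assumptions
made for illustration, and leaves are assumed never to be "(" or ")".

```python
# Hypothetical illustration: a tree is either a string leaf or a tuple
# (label, child, child, ...). The "syntax" is a flat list of tokens.

def unparse(tree, out=None):
    """Map a tree to a token sequence, visiting every node once."""
    out = [] if out is None else out
    if isinstance(tree, str):
        out.append(tree)
    else:
        label, *children = tree
        out.append("(")
        out.append(label)
        for child in children:
            unparse(child, out)
        out.append(")")
    return out


def parse(tokens):
    """Inverse of unparse: rebuild the tree from the token sequence."""
    def read(pos):
        if tokens[pos] != "(":
            return tokens[pos], pos + 1          # a leaf
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != ")":
            child, pos = read(pos)
            children.append(child)
        return (label, *children), pos + 1       # skip the closing ")"

    tree, end = read(0)
    if end != len(tokens):
        raise ValueError("trailing tokens after the tree")
    return tree


tree = ("*", "n", ("call", "f", ("-", "n", "1")))
tokens = unparse(tree)
assert parse(tokens) == tree   # the round trip loses nothing
```

Two syntaxes for the same set of trees then give a translation for free:
parse with one, unparse with the other.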

When I say Python -> C translation below, I mean that if a "C-program" is
written in Python syntax, it can be translated to a real C-program and
compiled. And if a "Python-program" is written in C-syntax, it can be
translated to Python syntax and executed by the Python-interpreter.
Comments and extra empty lines can be retained. This complicates things a
little, but without it there would be no way back.

At present it is possible to translate between different varieties of C
(-like languages), at least one of which is compilable by a normal
C-compiler. With time I can change the syntax of the system as a whole
and get all the code rewritten. (This is close already.)

And it is also (close to) possible to translate between Python and C in
the sense described above.
These are really two different things,
 - typeless C <-> (typeless) Python 
 - (typed) C <-> typed Python.

The Python code needs to be complemented with special comments. They are
ignored by the Python-interpreter (they *are* comments), but they are
parsed by the translator, and in this way it is possible to represent
C-programs.

<typed - C-semantics> (the : after # signals that this is a special comment
for the translator)

# note that this is actually runnable by python, it just happens to have
# systematic comments on types
#: int -> int
def f(n):
   # this computes 1 * 2 * .. * n
   if n == 0:
      return 1
   else:
      return n * f(n-1)

<=>

int f(int n)
{
   // this computes 1 * 2 * .. * n
   if (n == 0)
      return 1;
   else
      return n * f(n-1);
}
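
As a rough sketch of how a translator could use such a comment, the
following Python turns a "#:" line plus the "def" line after it into a C
function header. Only the "#: args -> result" shape comes from the example
above. The helper name c_signature, the comma-separated argument types and
the restriction to a one-line def are assumptions made for illustration.

```python
import re

def c_signature(type_comment, def_line):
    """Build a C function header from a '#:' comment and a 'def' line."""
    m = re.fullmatch(r"#:\s*(.*?)\s*->\s*(\S.*?)\s*", type_comment)
    if m is None:
        raise ValueError("not a translator comment: %r" % type_comment)
    # Assumed convention: several argument types are separated by commas.
    arg_types = [t.strip() for t in m.group(1).split(",") if t.strip()]
    result_type = m.group(2)

    d = re.fullmatch(r"\s*def\s+(\w+)\s*\((.*)\)\s*:\s*", def_line)
    if d is None:
        raise ValueError("not a def line: %r" % def_line)
    name = d.group(1)
    params = [p.strip() for p in d.group(2).split(",") if p.strip()]
    if len(params) != len(arg_types):
        raise ValueError("type comment and parameter list disagree")

    typed = ", ".join(f"{t} {p}" for t, p in zip(arg_types, params))
    return f"{result_type} {name}({typed or 'void'})"


print(c_signature("#: int -> int", "def f(n):"))   # int f(int n)
```

An ordinary "#" comment without the colon does not match the first
pattern, so it is rejected here and would simply be carried over as a
comment by a real translator.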

<untyped - Python-semantics>

def f(a):
   return a+1

<=> (two different)

// U - universal type
typedef int U; // for example

// parseable by C-compiler
U f(U a)
{
   return a+1;
}

// untyped C; in a more complicated example it is not parseable by a C-compiler,
// but that doesn't matter, it wouldn't run anyway. It can still be used as an alternative
// syntax for C code, one that might be appreciated by C-people while learning Python.
f(a)
{
   return a+1;
}


When it comes to Squeak / Python, both are untyped, so that
particular problem doesn't arise.

There are others however:
 Python / Squeak
 0 origin arrays / 1 origin arrays
 all variables reachable from outside / no, use selectors
 complex things to the left of assignment / no such
 modules / no
 instance variables spring into existence as needed / declared
 if: elif: else: / nope, binary ifTrue:ifFalse:
 can choose name for 'self' as ordinary parameter / self
  (a (twice) solved problem)
  a_message_(self, 2, 3) / self a: 2 message: 3
  message(self, 2, 3) / self messageI: 2 II: 3
  etc
 the control structures are easy though, so simple methods are easy, see below
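
The two naming conventions in the list above can be sketched in a few
lines of Python. This is only an illustration of the rule as stated in
this mail, not code from the tool. The function names roman, keywords and
send are made up, and treating a call with no arguments as a unary
message is an assumption.

```python
def roman(n):
    """Upper-case roman numeral for a positive integer (1 -> 'I', 4 -> 'IV')."""
    pairs = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = ""
    for value, digits in pairs:
        while n >= value:
            out += digits
            n -= value
    return out


def keywords(name, nargs):
    """Keyword parts of the Squeak selector for a Python function name."""
    if nargs == 0:
        return [name]                       # assumption: unary message
    if name.endswith("_"):
        # a_message_ -> a: message:  (one underscore per argument)
        parts = name[:-1].split("_")
        if len(parts) != nargs:
            raise ValueError("underscores and argument count disagree")
        return [p + ":" for p in parts]
    # message -> messageI: II: III: ...
    return [name + "I:"] + [roman(i) + ":" for i in range(2, nargs + 1)]


def send(receiver, name, args):
    """Render a Python call name(receiver, *args) as a Squeak message send."""
    parts = keywords(name, len(args))
    if not args:
        return f"{receiver} {parts[0]}"
    return receiver + " " + " ".join(f"{k} {a}" for k, a in zip(parts, args))


print(send("self", "a_message_", ["2", "3"]))   # self a: 2 message: 3
print(send("self", "message", ["2", "3"]))      # self messageI: 2 II: 3
```

Because the roman-numeral suffix is systematic, the mapping can be run
backwards as well, which is what a two-way translation needs.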

To do a real translation between some python-programs and some
squeak-programs:
- the instance variables have to be known. Some special comments can be
written in Python (all code will have to be rewritten a little),
- conventions for how to access variables in Squeak will have to be devised,
- modules have to be handled somehow,
- etc.
This is a mouthful, and the devil is in the details, as I'm sure you know.
However, this can be done, getting closer one step at a time.

What can be done soon is the following.
(some details need to be fixed; I didn't quite understand the Squeak fileOut
syntax when I did it)
(I estimate less than two days of work for this.)

Things parseable (but not executable) by Squeak can also be given to the
translator, and a working Python-program will result.
And in the other direction, things parseable but not runnable by Python
can be translated to working Squeak; that is, Python syntax can be used
for fileOuts.

def f(n):
   if n == 0:
      return 1
   else:
      return n * f(n-1)

<=>

" 'fI' follows a convention for writing non-Squeak method names within Squeak:
  printf("%d", 2) would be printfI: '%d' II: 2
  that is, roman numerals are used for this.
  Another possibility is to write f:
  but then the Python name would become f_
"
fI: n
   n = 0
      ifTrue: [^ 1]
      ifFalse: [^ n * (self fI: n - 1)]
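
A Python chain with elif has no direct counterpart in the binary
ifTrue:ifFalse: used here, but it can be nested. The following is a
minimal sketch of that nesting. It works on already-translated condition
and body strings, the function name nested_if is made up, and every
condition is wrapped in parentheses so that keyword messages inside it
stay grouped.

```python
def nested_if(branches, otherwise=None):
    """Render if/elif/.../else as nested ifTrue:ifFalse: sends.

    branches  -- non-empty list of (condition, body) strings in if/elif order
    otherwise -- body of the final else, or None when there is no else
    """
    (cond, body), rest = branches[0], branches[1:]
    text = f"({cond}) ifTrue: [{body}]"
    if rest:
        # Each elif becomes a complete conditional inside the ifFalse: block.
        text += f" ifFalse: [{nested_if(rest, otherwise)}]"
    elif otherwise is not None:
        text += f" ifFalse: [{otherwise}]"
    return text


print(nested_if([("n = 0", "^ 1"), ("n = 1", "^ 1")], "^ n * 2"))
# (n = 0) ifTrue: [^ 1] ifFalse: [(n = 1) ifTrue: [^ 1] ifFalse: [^ n * 2]]
```

Going the other way, a conditional nested in an ifFalse: block can be
flattened back into elif whenever the block contains nothing else.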

And also the syntactic part of Slang is within reach with this tool.
Details remain, but it will then be a two-way translation (in the
syntactic sense) between a (nontrivial) subset of Squeak and a
(nontrivial) subset of C. This can subsume the present Slang machinery
if wanted, and gives the option of editing the Slang code in C-syntax if
that is wanted.

At present the C-code of the tool is not very pleasant to read. I was
doing many experiments (not with drugs) while working on it. I don't
know how interested you are in this; it is a lot of code, making a
framework for getting to and from trees and juggling with trees
in between. (And incidentally a graphic editor for trees and a functional
reducer of them to normal form.)

It is my plan to port this to Squeak, beginning with the syntactic
parts, but you will probably get something working long before that. I
invite you to cooperate on this.

What I can do in a short time is to make a www-server that translates what
it gets to the other syntaxes, so you (and other interested people) can get a
feel for it.

Are you in a hurry?

My plan is to let it take some time while I learn Squeak and try to
convince people that this is a good idea, and later this year have
working code.

BTW, a good project for someone would be to write a Python-plugin for Squeak.
With time it would get increasingly smooth to use until, in two years let's say, material
would flow without effort between the systems. By that time the Python-community
might have chosen to abandon Tk for Squeak as GUI system.

Thanks for the interest. I have more similar material, responding to other
postings.

/Mats Nygren




