Squeak and type inference

Tue Oct 26 21:34:47 UTC 1999

"Michel Tilman" <mtilman at acm.org> wrote:
> I would like to know a bit more about what's currently going on in
this
> area. Anyone got some good references?
> 

I've been piddling at this for a while.  By far the best read I've seen
so far is the one Dan recommended:

	Ole Agesen, "The Cartesian Product Algorithm: Simple and Precise Type
Inference of Parametric Polymorphism", ECOOP '95.

This paper seems very clear and direct to me, and plus it ties in the
algorithms to a real programming language.  The paper is available on
the web somewhere; I think you can get it off the Self research pages.

Other good references are:

	- David Grove and Greg Defouw and Jeffrey Dean and Craig Chambers,
"Call Graph Construction in Object-Oriented Languages", OOPSLA '97. 
This paper generalizes the common algorithms in use.  Don't read this
first, but do read it some time.

	- Olin Shivers's dissertation and related papers,
http://www.ai.mit.edu/people/shivers/citations.html, which I can't seem
to download right now.  His stuff is oriented towards Scheme, but the
basic problem is the same: you've got function calls that you want to
match up to specific functions.  This is the paper which made me realize
that call-graph construction and type-inference are basically the same
problem--solve one and you solve the other.  Either this or the Agesen
would be a good first read.  

In addition to reading, I've assembled a basic type inferencer.  It's
closer to Shivers's basic algorithm than to Agesen's algorithm; it is
faster but it is also less precise.  (not that it's "fast" by any
stretch of imagination!)  The algorithm basically goes like this:

	1) Develop conservative type constraints, eg x := y means Type[x] >=
Type[y]

	2) Find a solution to the set of constraints.  To do this, initialize
the type of a literal to be the type required by the literal, the type
of everything else to be the empty set, and then "push" the literal
types forward through the constraint graph until nothing changes any
more.  For instance, if you have Type[x] >= Type[12], then set Type[12]
== {SmallInteger} and then add {SmallInteger} to Type[x].  If
{SmallInteger} wasn't already a member of Type[x], then add
{SmallInteger} to all types constrained to be a supertype of Type[x]. 
And so on.

	3) Return to 1 and try to find a tighter set of constraints.  Repeat
until neither 1 nor 2 changes anything.

The main difference between the algorithm I implemented and CPA is, I
only analyze each method one time, where CPA analyzes a method
separately for each possible combination of parameter and receiver
types.  For example, <SmallInteger>+<SmallInteger> would be analyzed
separately from <SmallInteger>+<Float>.  (In addition, Agesen's paper
talks about building optimistic instead of conservative call graphs and
then expanding them as it goes, but that's a separate issue, and I don't
even see how this works.  More thought needed on this one....)

Finally, there are two things I should have together in the next month
or two that might be of interest.  First, I have a parser which keeps
around variable names (Squeak's parser is geared towards compiling and
can make other kinds of analysis difficult.  The Refactoring Browser
includes a nice-looking parser, but I didn't realize it in time; I might
eventually end up using RB's parser, but currently the RB parser
probably doesn't handle Squeak's assignment-to-array-of-variables
syntax.  I could be wrong, of course).  I'll post a type inferencer
along with the parser when I get to it.  It's pretty neat to hit alt-n
and only get the relevant senders.  (It's also pretty annoying to hit
alt-n on something like "next", and to fall asleep waiting for it to
finish).

Second, I hope to precisely have a survey paper summarizing type
inference in Smalltalk by the end of November, including a summary of
the literature I've been able to find.  It won't be anything
publishable, but it might help people find the current state of things
more quickly than I did!

Now, if only I can manage to keep from dropping from exhaustion before
this is all done.  :)

Lex