Squik language features

Anthony Hannan ajh18 at cornell.edu
Thu Apr 17 06:04:33 UTC 2003


Jesse Welton <jwelton at pacific.mps.ohio-state.edu> wrote:
> Anthony Hannan wrote:
> > I am using the VM term broadly to include all the non-image parts of
> > Squeak.  The interpreter is one part of the VM which does have its own
> > instruction set.
> 
> In that case, I'm confused as to why you've said this will have no VM.

Today's non-image parts are in the VM.  I want to move these parts into
the image, so there will be no VM left.  The Interpreter is currently
part of the VM but it will move to the image like everything else.  The
interpreter also happens to have its own instruction set, and compiled
methods also happen to use this instruction set, though they don't have
to.  In particular, some will use machine code, especially the ones in
Interpreter itself.

These machine code methods will be platform dependent and will have to
be translated when moved to other platforms.  One way to handle this is
to keep the machine code in separate library files that can be compiled
on different machines.  These libraries are similar to a VM but they are
different because they are represented as objects in the image.  Today,
there is no VM object in the image, and even if there were, it would be
too big and include too many things.  The idea of the NoVM project is to
separate this out into different behaviors/objects like Interpreter,
GarbageCollector, etc., and to implement as much as possible in portable
and convenient Smalltalk and rely on adaptive optimization to make it
fast.

From now on I will separate the NoVM project from the Squik Language
project, which the rest of this email addresses.

> > > > No Instance Variables
>
> > The purpose of removing instance variable names and replacing them with
> > accessors is so subclasses can override them.  Automatic inheritance of
> > state is too binding to the implementation.  Getting rid of instance
> > variable names means you only inherit behavior.  The number of fields
> > in a subclass is under control of the subclass and is not automatically
> > inherited from the superclass(es).
> 
> Okay, this is actually kind of interesting.  I still think you'll have
> to consider how the field information is displayed in introspective
> tools like the debugger.  The names are far too valuable to simply
> give up.

The compiler can look for all methods that call instVarAt: and do
nothing else (an accessor) and display those in the debugger.

> (Also, now I don't understand what you were talking about
> when you described the compiler automatically recompiling accessors
> when superclasses' ivars are moved about due to MI.)

I was just trying to say that the compiler can warn you when field
positions have to change.  We can even allow the programmer to specify
accessor names and request they be generated and maintained
automatically, relieving him of the burden of managing field positions.
But at the low level, which I have been describing, it is just methods that
call instVarAt:(put:).
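
To make the low-level picture concrete, here is a rough sketch in Python
(used only for illustration; Squik itself is not shown).  The names Obj,
Point, OriginColumnPoint, inst_var_at and inst_var_at_put are made up for
this example and stand in for instVarAt:(put:).  It assumes 1-based field
indexes as in Smalltalk, and shows a subclass choosing its own field
count while overriding the inherited accessors.

```python
class Obj:
    """Hypothetical object with numbered fields and no named instance variables."""

    field_count = 0  # each class decides this for itself; it is not inherited state

    def __init__(self):
        self._fields = [None] * self.field_count

    def inst_var_at(self, index):
        # 1-based index, mirroring Smalltalk's instVarAt:
        return self._fields[index - 1]

    def inst_var_at_put(self, index, value):
        # 1-based index, mirroring Smalltalk's instVarAt:put:
        self._fields[index - 1] = value
        return value


class Point(Obj):
    field_count = 2

    # Accessors: methods that do nothing except read or write one field.
    def x(self):
        return self.inst_var_at(1)

    def x_put(self, value):
        return self.inst_var_at_put(1, value)

    def y(self):
        return self.inst_var_at(2)

    def y_put(self, value):
        return self.inst_var_at_put(2, value)


class OriginColumnPoint(Point):
    """Inherits Point's behavior but stores only y; x is always 0."""

    field_count = 1

    def x(self):
        return 0

    def x_put(self, value):
        raise ValueError("x is fixed at 0 for this class")

    def y(self):
        return self.inst_var_at(1)  # field position differs from Point

    def y_put(self, value):
        return self.inst_var_at_put(1, value)
```

Any inherited Point method that goes through x and y keeps working on an
OriginColumnPoint, even though the subclass holds one field instead of
two.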

> > > > No Global or Pool Variables
> > > 
> > > Having scoped environments in place of global and pool dictionaries
> > > seems like a good generalization, but I'm skeptical about tying this
> > > rigidly to the class hierarchy.  One shouldn't need to change the
> > > inheritance structure of a class in order to specify what package
> > > facilities are available to it.  These are independent concepts.
> > 
> > You can think of environments being independent of classes, but you can
> > also think of them as one and the same: one environment per class.  This
> > provides finer granularity of imports and reduces the management
> > complexity of maintaining an environment hierarchy and a class
> > hierarchy that are intertangled with each other.
> 
> I'm not sure that using inheritance to determine environments does in
> fact reduce management complexity.  It certainly doesn't provide finer
> granularity in principle, since uncoupled environments could be
> defined in arbitrarily fine hierarchies independent of the class
> hierarchy.
> 
> > Finer granularity of imports means a class can only use interfaces it
> > directly imports (via its class variables and inherited class
> > variables).  This provides finer control over security: you can only use
> > what you import.  [...]
> 
> But by linking this to inheritance, you're limiting your flexibility
> to provide security, in that all descendants of a given class always
> have access to that class's imports.

You make a good point: just because some superclass imports some
interface, it doesn't mean the subclass should import it as well.  So let
me propose an alternative that I have thought about as well.

Let's separate environments from classes, as you propose.  An
environment is a dictionary that can inherit from zero or more other
environments.  This environment hierarchy is independent of the class
hierarchy.  The compiler always runs with respect to some environment
where it looks up env (global) variables and selectors.  Selectors are
looked up in the factories found in the environment.  An editor always
has an environment associated with it, which is used to look up all words
(variables and selectors, except temp variables) in a method or
expression being compiled.
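
A minimal sketch of such an environment, in Python for illustration
only.  The class name Environment, its methods, and the sample bindings
are hypothetical, and the search order (local bindings first, then
parents depth-first from left to right) is an assumption; the proposal
above does not fix one.

```python
class Environment:
    """A dictionary of names that can inherit from zero or more other environments."""

    def __init__(self, *parents):
        self.parents = list(parents)
        self.bindings = {}

    def define(self, name, value):
        self.bindings[name] = value

    def lookup(self, name):
        # Local bindings win; otherwise search parents left to right (assumed order).
        if name in self.bindings:
            return self.bindings[name]
        for parent in self.parents:
            try:
                return parent.lookup(name)
            except KeyError:
                continue
        raise KeyError(name)


# Hypothetical environments; the bound values are placeholders.
kernel = Environment()
kernel.define("Transcript", "<transcript>")

files = Environment()
files.define("FileStream", "<file stream factory>")

app = Environment(kernel, files)   # imports both
sandbox = Environment(kernel)      # imports only the kernel
```

Code compiled with respect to sandbox cannot name FileStream at all,
regardless of which classes it subclasses, which is the separation from
the class hierarchy described above.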

Class variables can be removed and replaced with constant methods.  A
constant method is a method that just returns the object held as its
sole literal.  To change the constant you just replace the method with a
new constant method containing a different literal.  The literal can be
any object.
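
As an illustration only (Python, with made-up names constant_method,
Circle and default_radius), a constant method can be modeled as a
closure over its single literal.  Changing the "class variable" then
means installing a new method rather than assigning to shared state.

```python
def constant_method(literal):
    """Build a method that just returns the object held as its sole literal."""
    def method(self):
        return literal
    return method


class Circle:
    pass


# Install a constant method where a class variable would otherwise be used.
Circle.default_radius = constant_method(10)
circle = Circle()
first = circle.default_radius()

# "Changing the constant" replaces the method with a new constant method.
Circle.default_radius = constant_method(25)
second = circle.default_radius()
```

Because the constant is just a method, a subclass can override it like
any other method.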

> > > > Implicit Temporary Variable Declaration
> > > 
> > > What about lexically nested contexts?  You need a way to determine the
> > > scope of each temp var.
> > 
> > The compiler can figure this out automatically, and usually does a
> > better job than the lazy programmer that will declare all temps at the
> > top level even if some are only used in a block.
> 
>   testAutoScoping
>     setter := [:val | var := val ].
>     getter := [ var ].
>     ^Array with: setter with: getter
> 
> What's the scope of var, and how does the compiler know?

A temp var is declared in the scope where it is first assigned.  So in
the above example, var is declared in the setter block scope, but the
var in the getter block scope is undeclared and will raise an exception.
If you wanted to use the same var in both blocks you would have to
declare it in the outside method scope, such as:

	testAutoScoping
		var := nil.  "var is now declared in method scope"
		setter := [:val | var := val].
		getter := [var].
		^ {setter. getter}
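
The rule can be sketched as a small checker, here in Python for
illustration.  Everything in it (Scope, resolve, check, and the tuple
encoding of statements) is invented for this example and is not part of
any existing compiler.  It assumes statements are visited in source
order and that block arguments are declared in the block's own scope.

```python
class UndeclaredVariable(Exception):
    pass


class Scope:
    def __init__(self, parent=None, args=()):
        self.parent = parent
        self.declared = set(args)  # block arguments are declared on entry

    def visible(self, name):
        scope = self
        while scope is not None:
            if name in scope.declared:
                return True
            scope = scope.parent
        return False


def resolve(statements, scope):
    """Declare each temp in the scope of its first assignment; reject undeclared reads."""
    for stmt in statements:
        kind = stmt[0]
        if kind == "assign":
            if not scope.visible(stmt[1]):
                scope.declared.add(stmt[1])
        elif kind == "read":
            if not scope.visible(stmt[1]):
                raise UndeclaredVariable(stmt[1])
        elif kind == "block":
            _, args, body = stmt
            resolve(body, Scope(scope, args))


def check(statements):
    try:
        resolve(statements, Scope())
        return "ok"
    except UndeclaredVariable as err:
        return f"undeclared: {err}"


# setter := [:val | var := val].  getter := [var].  ^ {setter. getter}
original = [
    ("block", ["val"], [("read", "val"), ("assign", "var")]),
    ("assign", "setter"),
    ("block", [], [("read", "var")]),
    ("assign", "getter"),
    ("read", "setter"),
    ("read", "getter"),
]

# Same method with "var := nil." added at the top.
fixed = [("assign", "var")] + original
```

Under these assumptions check(original) reports var as undeclared in the
getter block, while check(fixed) accepts the method because var is first
assigned in the method scope.  The sketch does not model the proposed
ban on shadowing.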

>   testAutoScoping2
>     [ 1 to: 1000 do: [:i | self doSomething: i] ] fork.
>     1 to: 100 do: [:i | self doSomethingElse: i].
> 
> Any interference here?

No.  The two i's are two separate variables declared (as args) in two
independent block scopes.

> By the time you've refined your scoping rules
> to the point that the compiler can come up with the right answer to
> all reasonable cases you've anticipated, will they really be easy
> enough for programmers to anticipate the behavior of a given case at a
> glance?  And are you sure you want to completely disallow shadowing?

Maybe it would be clearer if we had a keyword like "declare var _ 0" (a
la C/Java) or maybe we should just leave the temp var list.  But I think
the rule "A temp is declared in the scope of first assignment" is not
that hard.  Plus it forces people to explicitly initialize their temps.

As far as limiting shadowing, I don't think that is a big deal, just use
another name.  All temp scopes are visible in the same method so it is
easy to pick another name, plus not shadowing is less ambiguous.

> > > > Factories
> > > 
> > A factory provides a protected interface to its
> > instance class so users that only want to create instances don't also
> > have the capability to change methods (again capability security).
> 
> Isn't that more naturally controlled by granting or withholding
> capabilities via the compiler interface?

I don't see what you mean.

The way I see it is that you have a compiler that creates compiled
methods from text, then you add that compiled method to a class using
Behavior>>addSelector:withMethod:.  You can't restrict
#addSelector:withMethod: since you will need it for your personal
classes.  But you want to prevent it for other classes.  The only way to
prevent it is to not give you a handle to the other classes.  Hence, the
factory idea.  You can get a handle to the factory of the other class,
but it won't allow you to add methods, only to view methods.  Maybe
factory is a bad name, maybe interface would be better, since we're also
using it to look up selectors.
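
Here is a rough sketch of the handle idea in Python, for illustration
only.  Behavior, add_selector and make_factory are made-up names
(add_selector stands in for Behavior>>addSelector:withMethod:), and
Python does not actually enforce the hiding, so this shows only the
shape of the interface, not real capability security.

```python
class Behavior:
    """Stand-in for a class object: holds methods and can create instances."""

    def __init__(self, name):
        self.name = name
        self.methods = {}

    def add_selector(self, selector, method):
        self.methods[selector] = method

    def new(self):
        # Instances are plain records here so they do not leak the Behavior handle.
        return {"class_name": self.name}


def make_factory(behavior):
    """Return a handle that can create instances and view methods, but not add them."""

    class Factory:
        def new(self):
            return behavior.new()

        def selectors(self):
            return sorted(behavior.methods)

    return Factory()


account_class = Behavior("Account")
account_class.add_selector("balance", lambda self: 0)

# Other code receives only the factory, never account_class itself.
account_factory = make_factory(account_class)
```

The owner keeps account_class and can still call add_selector on it;
holders of account_factory have no such method to call.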

> > > > Selectors are more than just symbols. A Selector points back to the
> > > > interface that it is a part of. The compiler binds message sends to the
> > > > selector found in a visible interface. If more than one interface
> > > > contains the same selector name then the compiler pops up the choice to
> > > > the programmer. The programmer usually knows which interface he is
> > > > targeting. The interface chosen is prefixed to the message, such as
> > > > "block blockClosure.value".
> > > 
> > > This may be a good way to handle selector collisions between
> > > protocols.  But it may not: it's verbose, and could produce a lot of
> > > false positives.  Consider the implementation of Dictionary, which
> > > uses both BlockClosures and Associations.  Even though there's no
> > > internal conflict in interpreting #value sent to either a block or an
> > > association, Dictionary code would have to specify which protocol
> > > applied in each case, simply because it has access to both.  (That is,
> > > unless you also propose to add static type checking.)
> > 
> > In this case you would separate the #value protocol out into a new
> > superclass that both BlockClosure and Association would inherit from
> > (remember multiple inheritance is allowed).  The senders of #value would
> > then bind to this new superclass protocol.  This makes explicit the
> > shared protocol that would otherwise be implicit and subtle.
> 
> The point is that this is not a shared protocol.

But it is a shared protocol in the example above.  You're sending #value
to a dictionary element not caring whether it is a block or an
association.  I call that a shared protocol.

> The #value method means something very different in each case.

If they really do mean something different then they really are
different selectors that just happen to have the same name.  I agree
Association value and Block value do mean different things: one means
retrieve value, the other means execute and return result.  I would
prefer Block value be changed to Block eval.  Then in your example above
you would either test the class and call the appropriate method or add a
new method that is polymorphic with both.  I would add a #valueOrEval
method to each, and call that.  The #valueOrEval selector would be added
to a new superclass (possibly named AssociationOrBlock) and inherited by
both Association and Block.

Each selector should have its own meaning; subclasses can implement that
meaning however they like but the meaning remains the same.  If two
selectors with separate meanings happen to have the same name such as
Association value and Block value then you can differentiate them by
adding prefixes, as in Association.value and Block.value.

Restricting message sends to a single meaning (possibly prefixed)
enforces better structure (modularity) in the code.
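
A minimal sketch of that binding step, in Python for illustration.  The
names Selector, AmbiguousSelector and bind are hypothetical, as is the
representation of visible interfaces as a dictionary from interface name
to selector names.  Where a real compiler would pop up a choice to the
programmer, this sketch simply raises an error.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Selector:
    """A selector is more than a symbol: it remembers its interface."""
    interface: str
    name: str


class AmbiguousSelector(Exception):
    pass


def bind(name, visible, prefix=None):
    """Bind a message name to the one matching selector among visible interfaces."""
    candidates = [
        Selector(interface, name)
        for interface, names in visible.items()
        if name in names and (prefix is None or prefix == interface)
    ]
    if not candidates:
        raise KeyError(name)
    if len(candidates) > 1:
        owners = ", ".join(sorted(c.interface for c in candidates))
        raise AmbiguousSelector(f"{name} is defined by: {owners}")
    return candidates[0]


# Hypothetical set of interfaces visible to the code being compiled.
visible = {
    "Association": {"key", "value"},
    "Block": {"value", "numArgs"},
}

key_selector = bind("key", visible)                      # unique, no prefix needed
block_value = bind("value", visible, prefix="Block")     # Block.value
assoc_value = bind("value", visible, prefix="Association")
```

Here block_value and assoc_value compare unequal even though both are
named value, which is the sense in which they are different selectors
that happen to share a name.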

> > > To what interface does a method like #asString belong?  If to Object,
> > > you're losing much of the encapsulation you're working for.  If to
> > > String, you've got an unworkable problem with interface through
> > > inheritance.
> > 
> > Like #value, you can move it out to a shared superclass.  But I don't
> > think it is a problem leaving it on Object if you believe all objects
> > should understand it.
> 
> You miss the point, which is that all objects should understand it,
> but possibly not all objects should have access to it.  Let me think
> what would make a better example...  How about #storeDataOn:, used by
> tools which serialize objects to disk, but likely something you want
> to restrict access to outside that context.

Ok, #storeDataOn: (or #asString) would be a selector in a new superclass
of Object.  All objects would understand it, but only environments that
have access to that superclass would be allowed to send it.

> > > > Multiple Inheritance
> > > 
> > > This seems like another overuse of inheritance to me.  In order for
> > > MyBlockClosureReplacement to *simulate* a BlockClosure, it has to
> > > inherit the implementation (and state!) of BlockClosure?
> > 
> > You don't inherit state because there are no instance variables.  Any
> > method implementations that you inherit that you don't like you can
> > override.
> > 
> > >  That sounds annoying to work around.
> > 
> > I don't think so.  It is likely that you will want to inherit most of
> > the method implementations.  You will mostly be concerned with
> > overriding accessors.
> 
> This does make more sense when state is not inherited.  Hmm, it does
> require exposing a class's implementation details (including access to
> its dependencies) to any replacement class, though.

I don't follow the last sentence.

Thanks for interrogating me, Jesse.  I think we're making progress.

Cheers,
Anthony


