[squeak-dev] Traits as a way of defining an interface instead of an abstract superclass

Fri Oct 12 23:04:13 UTC 2012

On 12 October 2012 23:09, Colin Putney <colin at wiresong.com> wrote:
> On Fri, Oct 12, 2012 at 1:02 PM, Chris Cunnington
> <smalltalktelevision at gmail.com> wrote:
>>> I'd love to see a tool that would let us discover
>>> these latent protocols, name them, and then view and manipulate the
>>> system based on them.
>>
>> An automated code archaeologist? The only thing I can think of similar to
>> that is studying old images (i.e. 2.7) and comparing them to today. What
>> kind of criteria would such a tool use? <Spock>Fascinating.</Spock>
>
> Well, I'm imagining something a bit like a type inference tool. But
> instead of figuring out what the concrete classes of receivers are, it
> would just collect a list of messages that get sent to it.
>
> So imagine that we pick a class at (pseudo) random, say ChangeSet, and
> focus on one of its variables, say 'structures'. If we look at all the
> methods of ChangeSet, we can see what messages get sent to
> 'structures'. Here's the list of methods that use 'structures':
>
> ChangeSet >> noteClassForgotten:
> ChangeSet >> noteClassStructure:
> ChangeSet >> structures
> ChangeSet >> askAddedInstVars:
> ChangeSet >> askRenames:addTo:using:
> ChangeSet >> askRemovedInstVars:
> ChangeSet >> checkForConversionMethods
> ChangeSet >> absorbStructureOfClass:from:
>
> #noteClassForgotten: sends the following messages to 'structures':
>
> #ifNil:
> #includesKey:
> #removeKey:ifAbsent:
>
> #noteClassStructure: sends these messages:
>
> #ifNil:
> #includesKey:
> #at:put:
>
> So we've got a little histogram of messages:
>
> #ifNil: - 2
> #includesKey: - 2
> #removeKey:ifAbsent: - 1
> #at:put: - 1
>
> And we can continue on down the list of ChangeSet's methods building
> up our statistical database. We could do the same thing with temporary
> variables, and even expressions that never get stored into a variable,
> but do get sent messages.
>
> A type inference tool would compare the sets of messages that an
> object receives to the actual classes in the image and try to figure
> out which classes it could be an instance of.
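
As a rough illustration of the histogram Colin describes, here is a minimal sketch in Python (chosen only for brevity; a real tool would live in the image and walk the parse trees). The per-method selector lists are copied from the two methods quoted above; the names sends_by_method and message_histogram are made up for this example.

```python
from collections import Counter

# Selectors sent to 'structures', keyed by the ChangeSet method that sends
# them. Hand-copied from the two methods quoted in the thread; a real tool
# would extract them from each method's parse tree.
sends_by_method = {
    "noteClassForgotten:": ["ifNil:", "includesKey:", "removeKey:ifAbsent:"],
    "noteClassStructure:": ["ifNil:", "includesKey:", "at:put:"],
}


def message_histogram(sends_by_method):
    """Count how many methods send each selector to the variable."""
    histogram = Counter()
    for selectors in sends_by_method.values():
        # Use a set so a selector counts once per method, however often
        # that method sends it.
        histogram.update(set(selectors))
    return histogram


histogram = message_histogram(sends_by_method)
for selector, count in sorted(histogram.items()):
    print(f"#{selector} - {count}")
```

Counting once per method rather than once per send is a design choice: it measures how widespread a selector is across the class, not how often one method repeats it.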

http://www.squeaksource.com/SqueakCheck does exactly this: given a
Theory (an arity 1 method on a TheoryTestCase marked with the <theory>
pragma), it finds all messages sent to the Theory's argument (call it
foo), and to foo class. It then finds all possible Class types that
match that protocol, and uses those classes as seeds for generating
random data to throw at the Theory.
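
The matching step can be sketched roughly as follows. This is not SqueakCheck's actual code, only a toy Python model of the idea: treat the messages sent to the argument as a required protocol and keep every class that understands all of them. The understood table is a hand-written, deliberately incomplete stand-in for the image, and candidate_classes is a hypothetical name.

```python
# Toy stand-in for the image: class name -> some selectors it understands.
# These sets are illustrative subsets, not the full Squeak protocols.
understood = {
    "Dictionary": {"ifNil:", "includesKey:", "removeKey:ifAbsent:", "at:put:"},
    "OrderedCollection": {"ifNil:", "at:put:", "add:"},
    "Set": {"ifNil:", "add:", "includes:"},
}


def candidate_classes(required, understood):
    """Return the classes whose selectors cover every required selector."""
    return sorted(
        name
        for name, selectors in understood.items()
        if required <= selectors  # subset test: all required selectors present
    )


# Protocol inferred from the messages sent to 'structures' in
# ChangeSet >> noteClassStructure:
required = {"ifNil:", "includesKey:", "at:put:"}
print(candidate_classes(required, understood))  # ['Dictionary']
```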

> What this protocol tool
> would do differently is ignore the actual classes in the image, and
> instead try to find patterns in the message sends, and try to shed
> light on programmer intent. So we might ask, what other messages are
> associated with #includesKey:, and how strong is the association?
> What's the largest set of messages that are sent to at least 95% of
> objects that receive #includesKey:? What objects fall into the other
> 5%?

This is a much more complicated/difficult problem than what
SqueakCheck tries to do.
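
To make the question concrete, here is a deliberately simplified Python sketch. It only measures, selector by selector, what fraction of the receivers of #includesKey: also receive each other selector, which is weaker than finding the largest jointly sent set. Apart from the ChangeSet entry, the receivers and their selector sets are invented for illustration, and co_occurrence is a hypothetical name.

```python
from collections import Counter

# Receiver (variable or expression) -> selectors observed being sent to it.
# Only the first entry comes from the thread; the rest are made up.
receivers = {
    "ChangeSet.structures": {"ifNil:", "includesKey:", "removeKey:ifAbsent:", "at:put:"},
    "hypothetical.cache": {"includesKey:", "at:put:", "at:"},
    "hypothetical.registry": {"includesKey:", "at:"},
    "hypothetical.items": {"add:", "do:"},
}


def co_occurrence(receivers, anchor):
    """Map each selector to the fraction of anchor-receivers that also get it."""
    with_anchor = [sels for sels in receivers.values() if anchor in sels]
    if not with_anchor:
        return {}
    counts = Counter(sel for sels in with_anchor for sel in sels if sel != anchor)
    return {sel: n / len(with_anchor) for sel, n in counts.items()}


support = co_occurrence(receivers, "includesKey:")
threshold = 0.6  # the thread suggests 0.95; lowered to suit the tiny sample
strong = sorted(sel for sel, frac in support.items() if frac >= threshold)
print(strong)  # ['at:', 'at:put:']
```

Receivers that get #includesKey: but fall below the threshold for the other selectors are the "other 5%" worth looking at by hand.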

> There would probably be a lot of noise in the data - #ifNil:, for
> example, might confuse things a bit. But I bet there's a lot of signal
> as well, and with a bit of direction from the user, that sort of tool
> might be able to point out, for example, a class that almost implements
> the Magnitude protocol, but is missing a couple of methods.

I wonder if some of the natural language tools can't help here: I'm
thinking things like bigram/trigram models and the like. My brain's
just melted from being up too late. There's another probabilistic tool
that can divide up high dimensional spaces to maximise differences in
data points. If I remember the name I'll mention it :/

frank

> Anyway, thanks for bringing up this topic. It's interesting stuff.
>
> Colin
>