Classes as Packages (was: Harvesting infrastructure)

Nathanael Schärli n.schaerli at gmx.net
Tue Nov 19 16:28:52 UTC 2002


Hi Anthony

I agree with most of what Andrew said. However, there *is* a difference
between Trait composition and MI that has a **major** impact on
usability, understandability, and compatibility with single inheritance!
Furthermore, I think that this is the reason why people didn't accept
MI, but (hopefully) will accept Traits. I'll try to explain it below...

> I read the new draft version of Traits, and I commend you, 
> Stephane, Oscar, and Andrew for your thorough work. 

Thanks!

> I think abstract classes without state 
> are analogous to traits.

I agree. Abstract classes without state that do not have to inherit
from Object and cannot be instantiated are Traits. However, the way
Traits are combined is different from how classes are combined in MI:
Whereas Traits are combined using "Trait composition" (which is
*orthogonal* to single inheritance), classes are combined using multiple
inheritance (which is an *extension* of single inheritance).

I far prefer the properties of Trait composition over those of MI,
especially if Traits (resp. MI) are used more extensively. One reason
for that is the fact that Trait composition itself carries no semantics.
This means that the semantics of a method is independent of whether it
is defined in a Trait or in the classes that use the Trait. We call that
property the "flattening property", because it allows one to work with
(view and edit) any class (even one that is built from many deeply
nested Traits) as if it were implemented in a traditional
single-inheritance manner.

In our practical experience with Traits, this property has been
invaluable. As an example, we have refactored a big part of the Squeak
collection hierarchy using Traits. In doing so, we have used Traits very
extensively: for example, the class corresponding to
SequenceableCollection uses more than 20 Traits (counting direct and
indirect uses). But because of the flattening property, this has *no*
negative impact on the understandability of the class as a whole.
Factoring a couple of methods into a new Trait is as easy as putting
them into a new protocol in the current version of Squeak. It has no
semantic meaning and therefore comes at zero cost: even after the
methods are factored out into a Trait, the (flattened) class looks
identical to before. It has the same number of methods
and all the methods have identical implementations. Furthermore, the
class still has only one superclass (namely Collection) and it uses
"super" to unambiguously refer to features implemented in this
superclass.
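
To make the flattening property concrete, here is a minimal sketch in
Python (only a runnable stand-in; the Traits prototype itself lives in
Squeak). The helper `uses`, the trait `T_EMPTINESS`, and the toy
`Collection` class are hypothetical names invented for this illustration
and are not part of the Traits implementation. The sketch models a Trait
as a plain bundle of methods that is copied into the class, so the
composed class ends up with the same methods and the same single
superclass as a hand-written one.

```python
def uses(*traits):
    """Class decorator: copy each trait's methods into the class itself."""
    def apply(cls):
        own = set(vars(cls))  # methods written directly in the class
        for trait in traits:
            for name, func in trait.items():
                if name in own:
                    continue  # the class's own method takes precedence
                if name in vars(cls):
                    # Two traits provide the same name: report it instead of guessing.
                    raise TypeError(f"conflicting trait method: {name}")
                setattr(cls, name, func)
        return cls
    return apply


# A "trait": stateless methods that only require `size` from their user.
def is_empty(self):
    return self.size() == 0

def not_empty(self):
    return not self.is_empty()

T_EMPTINESS = {"is_empty": is_empty, "not_empty": not_empty}


class Collection:
    def __init__(self, items=()):
        self.items = list(items)

    def size(self):
        return len(self.items)


@uses(T_EMPTINESS)
class Composed(Collection):      # written with a trait
    pass


class Flat(Collection):          # written by hand, no traits
    def is_empty(self):
        return self.size() == 0

    def not_empty(self):
        return not self.is_empty()


def own_methods(cls):
    """Names of the methods that live directly in the class."""
    return sorted(n for n, v in vars(cls).items() if callable(v))
```

Both classes list the same own methods and have `Collection` as their
only superclass, which is the sense in which the composed class can be
viewed as if it had been written without Traits.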

Whereas this property has been extremely helpful for us while we were
refactoring the hierarchy, it would be even more important for all the
Squeak users, because they wouldn't even notice (if they don't want to)
that their well-known collection classes are now written with Traits. If
they don't want to know about Traits, the system looks the same way as
before and they can work with the system the same way as before! This is
not only true for the collection hierarchy, but it would be true for the
whole image. Even if the methods in the existing classes were shared
by factoring them out into thousands of reusable Traits, a user could
work with the image as if Traits didn't even exist. (There is a
preference to not show Traits in the browser, and then it is entirely
impossible to tell whether a class is written using Traits or not.)

Now imagine that you did the same fine-grained refactoring with MI.
First of all, the class SequenceableCollection would not only have the
superclass Collection, but it would inherit from many parallel
inheritance chains. Thus, it would be harder to understand for people
who are used to single inheritance and the existing collection
hierarchy. Furthermore, the usage of "super" in the existing methods
would sometimes be ambiguous because more than one superclass provides
an implementation for the sent message. Thus, we would need to change
the source code of the methods using a new syntax (e.g.,
"super.Collection do:").

Last but not least, MI associates a semantic meaning with the place
where a method is defined. As an example consider the following MI
graph:

	A        B
	^        ^
	|        |
	|        C
	|        ^
	|        |
	----|-----
	    |
	    D

Let's assume that A and B implement a method foo, and C contains a method
bar sending "super foo". Unlike with Traits, it is not possible to move
the method bar into the class D, because the *static* place where super
is used (i.e., defined) matters. If it is used in C, super refers to B;
however, if it is used in D it may refer to A or C and therefore causes
an ambiguity.
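
The dependence on the place of definition can be tried out with a small
Python sketch of the same graph. This is only an approximation: Python
does not report an ambiguity but silently picks one superclass through
its linearization of the class graph, so the example shows the related
effect that an unchanged method body reaches a different foo once it is
moved from C to D. The classes C2 and D2 are hypothetical copies
introduced only so that both placements can be compared side by side.

```python
class A:
    def foo(self):
        return "A.foo"


class B:
    def foo(self):
        return "B.foo"


# Placement 1: bar is defined in C.
class C(B):
    def bar(self):
        # Zero-argument super() is tied to the class whose body contains it (C).
        return super().foo()


class D(A, C):
    pass


# Placement 2: the very same method body, moved down into D2.
class C2(B):
    pass


class D2(A, C2):
    def bar(self):
        # Same source text as C.bar, but now super() is tied to D2.
        return super().foo()


in_c = D().bar()    # lookup continues after C, so it finds B's foo
in_d = D2().bar()   # lookup continues after D2, so it finds A's foo
```

The body of bar is identical in both placements, yet `in_c` and `in_d`
differ, so moving the method changed its meaning.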

As a result, MI in general does not allow a programmer to understand the
behavior of a class without knowing where (in what class or superclass)
the individual methods are implemented. This means that MI *urges* a
programmer to look at the code in the way it was written and consider
all the involved building blocks. In addition, it generally requires
changing the source code of methods when people want to refactor
existing code. In my experience, this is precisely what people
don't like about MI, especially if MI is used extensively. As I said,
imagine a class that is perfectly refactored into 50 superclasses that
are arranged in 8 parallel chains. People typically hate that, because
it is very hard to understand the meaning of the whole.

With Traits, it does not matter whether a class is written by composing
50 Traits with potentially conflicting methods, because the user is *not
urged* to understand and view/edit the class the way it was written.
Instead, one can flatten the class at any nesting level and the code
still makes perfect sense and has identical semantics. As I said, this
flattening has no effect on individual methods: All the method bodies
remain the same.

> 	Let me go over the so-called disadvantages of MI raised 
> by the Traits paper and try to refute them.
> 	"Limited compositional power" - This was the most 
> compelling argument in favor of traits.  But the traits 
> solution to the SyncReadWrite example uses 'super read' in 
> its trait without implementing it, making it a requirement of 
> the trait (implicitly or explicitly, it does not matter).  
> Since it is a requirement you could just as well name it 
> anything like asyncRead and not use super.  In fact, use of 
> super in traits does not mean anything when looking at the 
> trait by itself, so I question its appropriateness.  In 
> multiple inheritance, I would have SyncReadWrite call 
> asyncRead and define it as 'self subclassResponsibility'.  
> Then I would have SyncA inherit from A and SyncReadWrite, and 
> define SyncA>>asyncRead as 'super.A read' and
> SyncA>>read as 'super.SyncReadWrite read'.

I'm glad you mentioned this example. It is clear that you can factor out
the synchronization code with MI, but you still need to duplicate 4 (!)
methods (read, asyncRead, write, asyncWrite) for *every* class you want
to make synchronized! This means that it just exchanges duplicating two
4-line methods with duplicating four 1-line methods. In my eyes, this is
not a satisfactory solution, especially since there are other approaches
(mixins and Traits) that do not require any code duplication. 

However, there is a much more important reason why I think that Trait
composition exhibits much nicer composition properties than MI. In fact,
this is a good example to illustrate that MI often urges a programmer to
implement some code in a clumsy way just because he wants to benefit
from reuse.

Let's start with the example where we have a single class A (with
methods read and write) and we want to make a synchronized version
SyncA.

In a single-inheritance language, every programmer would do this by
making a subclass SyncA with the methods read and write:

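SyncA>>read
	"Run the inherited read while holding the lock."
	| result |
	self acquireLock.
	result := super read.
	self releaseLock.
	^ result

SyncA>>write
	"Run the inherited write while holding the lock."
	| result |
	self acquireLock.
	result := super write.
	self releaseLock.
	^ result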

Now assume again that there is another class B that also provides read
and write and we want to synchronize it in the same way. With Traits,
this is completely straightforward, because we can simply reuse the
methods in SyncA as they are: We just define a new Trait SyncRW, we make
SyncA use this Trait, and we drag the methods read, write, acquireLock,
and releaseLock from the class SyncA into SyncRW. If we look at the
class SyncA in the flattened view, it is impossible to tell any
difference, because this class still has the superclass A and it
contains the same 4 methods. Only the compositional view shows us that
these methods are actually implemented in a reusable subtrait rather
than in the class itself. (Think of the reusable subtrait SyncRW as just
being a reusable protocol. Traits are not more complicated than
protocols.)

In order to create the synchronized version of B, we just create a class
SyncB that inherits from B and uses the Trait SyncRW. Again, the class
SyncB looks as if the synchronization code were implemented directly in
itself. In summary, this means:

- The programmer can write the synchronization code as if he didn't want
it to be reusable.
- Making the synchronization code reusable is as easy as dragging it
over into a subtrait.
- No methods need to be added and the existing methods do not have to
be modified.
- Another user does not need to know anything about the subtrait. He can
still understand, view, and edit the code of the synchronized classes as
if it were implemented by using code duplication and single inheritance!
- The user can also look at the classes in the compositional view and
then he sees that they are actually implemented using a reusable Trait.

Now let's see what happens with MI. If you only want to create a
synchronized version of one class, you can do it in the traditional and
well-understood way. However, if you want to make this code reusable, MI
requires you to do things you would not do otherwise (cf. your
description of the MI solution):

In particular, for every synchronized subclass, you need to write 4
*new* methods:

SyncA>>asyncRead
	^ super.A read

SyncA>>asyncWrite
	^ super.A write

SyncA>>read
	^ super.SyncRW read

SyncA>>write
	^ super.SyncRW write


Besides the fact that these methods still have to be duplicated for every
synchronized subclass, this also requires the user to understand MI. In
fact, these two things are caused by the same limitation of MI: The fact
that super is statically interpreted (and therefore location dependent)
is precisely the reason why you cannot make these four methods reusable,
and it is also the reason why the user needs to know where they are
implemented in order to understand them.
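
For readers who want to try the MI variant, here is a rough Python
rendering of it. It is a sketch under stated assumptions, not the code
from the paper: the explicit calls `A.read(self)` and `SyncRW.read(self)`
stand in for "super.A read" and "super.SyncRW read", the names
async_read and async_write follow the proposal quoted above, and the toy
classes A and B simply return strings.

```python
import threading


class A:
    def read(self):
        return "A.read"

    def write(self):
        return "A.write"


class B:
    def read(self):
        return "B.read"

    def write(self):
        return "B.write"


class SyncRW:
    """Reusable locking code; every subclass must supply the async hooks."""

    def __init__(self):
        self._lock = threading.Lock()

    def async_read(self):
        raise NotImplementedError   # plays the role of 'self subclassResponsibility'

    def async_write(self):
        raise NotImplementedError

    def read(self):
        with self._lock:
            return self.async_read()

    def write(self):
        with self._lock:
            return self.async_write()


class SyncA(A, SyncRW):
    # Four glue methods that name the superclasses explicitly.
    def async_read(self):
        return A.read(self)         # 'super.A read'

    def async_write(self):
        return A.write(self)        # 'super.A write'

    def read(self):
        return SyncRW.read(self)    # 'super.SyncRW read'

    def write(self):
        return SyncRW.write(self)   # 'super.SyncRW write'


class SyncB(B, SyncRW):
    # The same four methods again, differing only in the class name.
    def async_read(self):
        return B.read(self)

    def async_write(self):
        return B.write(self)

    def read(self):
        return SyncRW.read(self)

    def write(self):
        return SyncRW.write(self)
```

The glue in SyncA and SyncB cannot be shared, because each method has to
name the concrete superclass it delegates to.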

This means that just because the implementor wants to achieve better
code reuse, he *forces* another user to deal with more complicated code.
Thus, unlike with the Traits solution, there is a clear tradeoff between
reusability and understandability, and I think this is exactly what
people don't like about MI.

As a comparative summary for MI, I would say:

- The programmer needs to write different synchronization code if he
wants it to be reusable.
- Making the synchronization code reusable means that it has to be
reimplemented.
- For every application of the (pseudo-) reusable code, the programmer
still has to duplicate four methods.
- Another user is forced to understand the code the way it is written,
which is with multiple inheritance. I claim that most of the users would
prefer to look at the code in the traditional single-inheritance view,
because they find that generally more understandable and
straightforward.


> In fact, use of 
> super in traits does not mean anything when looking at the 
> trait by itself, so I question its appropriateness. 

Well, the use of super in a method implemented in a Trait means the
same thing as the use of super in a method implemented in a class: It
simply causes the method lookup to start in the superclass of the class
that finally contains the method. In the case of Traits, this superclass
is just unknown (parameterized), but this does not mean that "it does
not mean anything". In fact, this is the same as for mixins, which are
basically parameterized subclasses that can be reused for an arbitrary
superclass.
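
The idea of a parameterized superclass can be sketched in Python as a
function that takes the superclass as an argument and returns a
synchronized subclass of it. This illustrates the mixin reading of
"super" only; it is not how Traits are implemented, and the function
name rw_sync as well as the toy classes A and B are hypothetical.

```python
import threading


class A:
    def read(self):
        return "A.read"

    def write(self):
        return "A.write"


class B:
    def read(self):
        return "B.read"

    def write(self):
        return "B.write"


def rw_sync(base):
    """Return a subclass of `base` whose read and write hold a lock."""

    class Sync(base):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self._lock = threading.Lock()

        def read(self):
            with self._lock:
                # 'super' means whatever `base` turned out to be.
                return super().read()

        def write(self):
            with self._lock:
                return super().write()

    Sync.__name__ = "Sync" + base.__name__
    return Sync


SyncA = rw_sync(A)   # super read reaches A's read
SyncB = rw_sync(B)   # the same method text, now reaching B's read
```

The method bodies are written once, and each resulting class still has
exactly one superclass.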

A programmer simply has to know that every method defined in a Trait
behaves exactly the same way as if the method were implemented
directly in the classes that use it, and he immediately understands the
meaning of "super".


> 	Finally, I think MI is more conducive to reuse than 
> Traits.  If you want to reuse a class that is outside of your 
> single inheritance hierarchy, in MI you can just inherit from 
> it, but in Traits you have to convert the class to a trait 
> which is difficult if you also want to include its inherited behavior.

As Andrew already said, I wrote a tool that does this automatically.
Besides that, I can only refer to what I wrote above: Refactoring is
much easier with Traits because they do not require modification of
existing methods, whereas MI generally does. (See the example with
SyncRW.) Thus, refactoring with Traits is simply "drag and drop". In
addition, even if someone refactored the whole Squeak image with Traits,
it would have zero negative impact on understandability, because a user
could still see the system as it is now. Moreover, a refactored system
would *allow* (but not *urge*) a user to look at a class in a structured
view and see how the building blocks are composed.


Nathanael



> -----Original Message-----
> From: squeak-dev-admin at lists.squeakfoundation.org 
> [mailto:squeak-dev-admin at lists.squeakfoundation.org] On 
> Behalf Of Anthony Hannan
> Sent: Tuesday, November 19, 2002 5:55 AM
> To: squeak-dev at lists.squeakfoundation.org
> Subject: RE: Classes as Packages (was: Harvesting infrastructure)
> 
> 
> Nathanael Scharli <n.schaerli at gmx.net> wrote:
> > I agree with you. Your criticism of MI is just about what inspired us
> > to do the Traits work. Have a look at the latest draft of the Traits
> > paper
> > (http://www.iam.unibe.ch/~schaerli/traits_draft/traitsDraft.pdf) and
> > you will find a detailed description of good reasons why MI didn't
> > make it into recent languages. I hope that Traits are going to be more
> > successful ;)
> 
> I read the new draft version of Traits, and I commend you, 
> Stephane, Oscar, and Andrew for your thorough work.  But I 
> have to say, I don't see a clear advantage of it over 
> multiple inheritance without state. 
> "Without state" means replacing instance variables with 
> primitive accessor methods that only work on concrete 
> instances of their method class.  Every class has to define 
> its own accessors, but they can be generated automatically 
> from superclasses.
> 
> 	Let me go over the so-called disadvantages of MI raised 
> by the Traits paper and try to refute them.
> 	"Limited compositional power" - This was the most 
> compelling argument in favor of traits.  But the traits 
> solution to the SyncReadWrite example uses 'super read' in 
> its trait without implementing it, making it a requirement of 
> the trait (implicitly or explicitly, it does not matter).  
> Since it is a requirement you could just as well name it 
> anything like asyncRead and not use super.  In fact, use of 
> super in traits does not mean anything when looking at the 
> trait by itself, so I question its appropriateness.  In 
> multiple inheritance, I would have SyncReadWrite call 
> asyncRead and define it as 'self subclassResponsibility'.  
> Then I would have SyncA inherit from A and SyncReadWrite, and 
> define SyncA>>asyncRead as 'super.A read' and
> SyncA>>read as 'super.SyncReadWrite read'.
> 	"Accessing overridden features" - To limit this problem 
> in multiple inheritance, I would restrict classes to naming 
> their immediate superclasses only as in 'super.A', where A is 
> an immediate superclass. 
> I agree this does tie the hierarchy into the code a bit, but 
> I think it can be managed pretty easily.  If the A superclass 
> gets removed from the class, all methods that have super 
> references to it could be brought to the user's attention, or 
> they will raise an error if executed.  It is not much 
> different from removing an instance variable from a class and 
> not updating methods that reference it.
> 	"Conflicting features" - Since I am proposing no 
> inheritance of state, this is not a problem.  Even the Traits 
> paper says "Whereas method conflicts can be resolved 
> relatively easily (e.g. by overriding), conflicting state is 
> more problematic", page 4.
> 	"Two contradictory roles: instance generators and unit 
> of code reuse" - The Traits paper claims that this pushes 
> classes to be complete classes instead of just small units of 
> reusable code.  I say what about the concept of abstract 
> classes, they are not meant to be complete or instantiated, 
> just shared by subclasses.  They don't even have to be 
> subclasses of Object.  I think abstract classes without state 
> are analogous to traits.
> 
> 	I think that is my whole point: abstract behaviors and 
> traits are analogous.  So why add a new way of defining 
> behavior and putting them together, yielding two types of 
> behaviors and composition in the same system.  Even in the 
> future work section of the Traits paper, they talk about 
> replacing single inheritance and adding state to traits.  
> This would wrap them back around to multiple inheritance but 
> in a nested composition form instead of an inheritance form, 
> which would basically be the same thing.
> 	Finally, I think MI is more conducive to reuse than 
> Traits.  If you want to reuse a class that is outside of your 
> single inheritance hierarchy, in MI you can just inherit from 
> it, but in Traits you have to convert the class to a trait 
> which is difficult if you also want to include its inherited behavior.
> 
> Cheers,
> Anthony
> 



