[FIX][KCP] KCP-0112-FixCanUnderstand

Tue Dec 16 01:51:56 UTC 2003

With respect to #respondsTo: and #canUnderstand:
I feel rather cross, because this issue has *already* been discussed.
I provided an [ENH] which defined

    Object>>honestlyRespondsTo:
    Behavior>>canHonestlyUnderstand:

I can well believe that my code was buggy, but it's like a slap
in the face to have the entire discussion and code ignored.

The way Squeak currently implements #respondsTo: and #canUnderstand:
is compatible with the way other Smalltalks do it.

To be sure, the operations have misleading names.  If it were a matter
of designing a new language, they'd be good names to avoid.  To be sure,
there is a fair bit of broken code that uses #respondsTo: when it
should not.  But the best response is to fix that code, not change the
semantics of a basic operation.

	I decided that this is the right way because it is the semantics that is
	expected by the vast majority of senders in the system.

There are 112 senders of #respondsTo: in Squeak 3.6g2; if it is true that
"(the human authors of) the vast majority of senders (of #respondsTo:) in
the system" didn't understand the semantics of a basic operation which is
described extremely clearly in several classic textbooks, this is extremely
worrying.  But still, adopting (and correcting, if necessary) the
#honestlyRespondsTo: solution and fixing the broken code seems better than
changing the semantics of a basic operation.

	close
		"close if you can"
		(stream respondsTo: #close) ifTrue: [
				stream closed ifFalse: [stream close]]

	This method uses the message #respondsTo: (which is essentially the same
	as #canUnderstand:) in order to know whether the class of 'stream'
	actually offers the functionality #close and not whether the class of
	'stream' declares an abstract or disabled method #close!

Well, any code that passes in an object where an operation required by
the receiver is abstract *DESERVES* to break; in fact ought to break as
loudly and as early as possible.

	Therefore, the code immediately breaks if it is used for a stream class
	that either declares the method #close to be abstract (i.e., by
	implementing it with the body 'self subclassResponsibility') or declares
	it to be not appropriate (i.e., by implementing it with the body 'self
	shouldNotImplement').

And that's exactly what it SHOULD do.  If the operation is abstract,
it means that class is an abstract class.  There shouldn't _be_ any
instances.  If you have an instance of an abstract class, what you have
is an instance which doesn't know whether it supports a particular operation
or not.  Suppose, for example, we have a concrete stream object of a class
which leaves #close abstract (self subclassResponsibility).

Then there is *no* implementation of
	(stream respondsTo: #close)
	    ifTrue: [stream closed ifFalse: [stream close]]
which can do the right thing.  Why?  Because (stream) doesn't *know*
whether it responds to #close or not.  Yes, calling #close in this
case is wrong, BUT SO IS *NOT* CALLING #CLOSE.

	This is really problematic because especially declaring a method as
	abstract is an extremely important concept of OO programming and it is
	therefore not acceptable that Smalltalk does not offer a programmer to
	do so without breaking the system.

I agree 100% that abstract classes and abstract methods are important
OO concepts.  I expect anyone who uses them to have made an honest attempt
to understand them.  A class which has at least one abstract method is an
abstract class.  And one of the basic rules about abstract classes is
DON'T CREATE INSTANCES OF ABSTRACT CLASSES.

If you think you have an application where this makes sense, what _really_
makes sense (at least in Smalltalk) is to create an application-specific
subclass of the abstract class where you have made your mind up about
which operations will be supported and how.

The real problem here is that Squeak doesn't actually _know_ when a class
is abstract, and doesn't notice when you create an instance of one.

Now, I do agree that if a class defines a selector by self shouldNotImplement    
then an instance of (a concrete subclass of) that class should not be regarded
as responding to the selector in question.  That's why I wrote
#honestlyRespondsTo:.  A class where you *have* made a conscious decision
"No I do not support this method" is different from a class where you have
decided not to decide yet.
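
To make the distinction concrete, here is a minimal sketch.  It is not the
code from the [ENH]; it assumes that Behavior>>lookupSelector: answers the
method found by walking the superclass chain (or nil if there is none), and
that CompiledMethod>>messages answers the selectors a method sends.

    Behavior>>canHonestlyUnderstand: selector
        "Sketch only.  Answer true if a method is found for selector
         and that method does not send #shouldNotImplement."
        | method |
        method := self lookupSelector: selector.
        method isNil ifTrue: [^false].
        ^(method messages includes: #shouldNotImplement) not

    Object>>honestlyRespondsTo: selector
        "Sketch only.  Same question, asked of an instance."
        ^self class canHonestlyUnderstand: selector

A method that sends #shouldNotImplement only on some path is also treated
as refused, so this is an approximation.  Note that an abstract method
(self subclassResponsibility) still counts as understood here.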

	I don't know. If they do it the same way as it has been done in Squeak,

Modulo implementation details, they do.

	it is more than likely that also in these dialects, the semantics of
	#canUnderstand: and #respondsTo: does not correspond to what is expected
	in practically all the users of these methods.

Quite likely.  This is why the better Smalltalk textbooks say "don't use
#respondsTo:, it doesn't do what you think it does."  The _real_ best thing
to do to Squeak is probably not to butcher #respondsTo: but to redesign
much of the code that uses it.

And of course the Lint checking part of RB should offer a warning about
uses of #respondsTo:/#canUnderstand:.  Maybe it does; I haven't checked.

	As a consequence, I assume that also these dialects do not allow
	a programmer to declare methods that should be implemented in
	subclasses as 'subclassResponsibility' without breaking other code.

I can't make sense of that.  They are just like Squeak:  you can define
any method you want to be self subclassResponsibility, but if you do that,
you then have an obligation NEVER TO CREATE A DIRECT INSTANCE OF SUCH A CLASS.

	Our goal is to have a kernel that allows a programmer to explicitly
	declare methods that should be implemented in subclasses (i.e.,
	abstract methods) and methods that are not appropriate for a certain
	class without crashing the whole image.

We have that already.  The problem is not *declaring* methods that should
be implemented in subclasses, it is *creating direct instances of abstract
classes*.  The fix that is needed is some way for the system to know that
a class is an abstract class and for it to forbid attempts to directly
instantiate such classes.
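
A rough sketch of such a check follows.  The selector #hasAbstractMethods
and the class AbstractThing are made up for illustration, and the sketch
assumes that Behavior>>allSelectors, Behavior>>lookupSelector: and
CompiledMethod>>messages behave as their names suggest.  It rescans the
methods on every call, which is slow but simple.

    Behavior>>hasAbstractMethods
        "Hypothetical.  Answer true if some selector my instances
         understand resolves to a method that sends
         #subclassResponsibility."
        ^(self allSelectors
            detect: [:each |
                (self lookupSelector: each) messages
                    includes: #subclassResponsibility]
            ifNone: [nil]) notNil

    AbstractThing class>>new
        "Hypothetical.  Refuse to make instances while anything is
         still abstract."
        self hasAbstractMethods
            ifTrue: [^self error: 'cannot instantiate abstract class ', self name].
        ^super new

Concrete subclasses of AbstractThing inherit this #new and pass the check
once they have overridden every abstract method.  A real version would
also have to decide how to treat abstract methods inherited from very
general classes.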

Consider what #respondsTo: can answer when you ask about a selector for
an abstract method.  (I repeat, this should not be possible, because
there shouldn't be any instances that _have_ abstract methods.)
- If it says (true), then this will typically result in the method being
  called, which means there will be a run-time error.
- If it says (false), then this will typically result in the method NOT
  being called, which is also wrong, because almost all #subclassResponsibility
  messages are refined by real implementations, not by #shouldNotImplement.

If in some particular case, the right reaction to an abstract method is
to ignore it, then in that case, the right way to define the method in
the first place was as a method that does nothing, *NOT* as an abstract method.
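
For instance (the class name is made up), a stream-like superclass that
wants "ignore it" as the default can say so directly:

    SomeStream>>close
        "Nothing to release by default.  Subclasses that hold
         resources override this."
        ^self

With this definition (stream respondsTo: #close) answers true and calling
#close is harmless, so callers need no special case.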

	a) Leave everything as it is. This means that declaring a method as
	abstract or inappropriate for a class has nasty side-effects that break
	other code or even crash the image.

This is misleadingly stated.  There is no problem with declaring abstract
methods.  The problem is ignoring the gross error of creating a direct
instance of an abstract class.  (OK, OK, I'm a hacker too; it can be quite
useful for testing to create such an instance and run some tests on it.
However, this is precisely when you *DON'T* want #respondsTo: hiding calls
to unwritten methods.  If one of your test cases _should_ be calling #close
and that's not defined, you want to know about it.  More accurately, you
desperately *need* to know about it.)

	b) Change #canUnderstand: (and #respondsTo:) so that the semantics
	actually corresponds to what 99% of the *current* users expect when
	they call it.

As noted repeatedly above, there is, and can be, *NO* fix to #respondsTo:
which will always give the right answer for code which is so blatantly
buggy as to provide an object with an abstract method to a receiver which
has a use for calling that method.

This is not an argument against revising #respondsTo: to say "no" when
presented with a #shouldNotImplement definition.

	c) Introduce new methods for #canUnderstand: and #respondsTo: and change
	practically all the users of #canUnderstand: and #respondsTo: so that
	they now use these new methods instead.

This is obviously the best way to proceed.
Why?
Because code that has been converted to use the new methods is code which
*has* been inspected to discover what the intended semantics actually is,
and code that has not been so converted is code that *hasn't* been converted
yet, so you *want* the run-time error as a way to tell you "you haven't
looked at this one yet."

	For us, a) is not acceptable because we do not want to have a kernel
	that does not allow clean OO programming.

I do not believe that there is any other OO community that would regard
creating direct instances of abstract classes as "clean OO programming".
All the statically checked OO languages I'm familiar with reject such
programs at compile time.  

In my (still unfinished, probably never to be finished) Smalltalk compiler,
I created new messages abstractSubclass:... for declaring abstract classes,
and didn't allow "self subclassResponsibility" in concrete classes.  One
reason was the dispatch method I was using; numbering only the concrete
classes helped.  But I certainly found it helpful to mark the distinction:
it simplified my code.  Squeak could tell abstract classes from concrete
classes without ceasing to be dynamic; it could be as simple as having a
"number of subclassResponsibility methods" variable in a class.