About KCP and automatic initialize

Richard A. O'Keefe ok at cs.otago.ac.nz
Tue Sep 16 02:49:00 UTC 2003


"Andreas Raab" <andreas.raab at gmx.de> wrote:
	Much of what [Ned was] saying comes down to a fundamental
	misunderstanding on your part - it is that the new/initialize
	pattern is the _only_ way to initialize an object and that (once
	you have created an object) it must be "properly initialized by
	default".

I have certainly never thought that the proposal (going back to 1999 at
least) means any such thing.

However, it would be dishonest to pretend that the method name
#initialize doesn't *suggest* to non-paranoid readers that its job
is to properly initialize an object.

I can only too easily imagine the books and tutorials that might be
written:
    "Object>>new calls #initialize so that when you create an object
    using #new it is always initialized."

Let's be really honest here:  I have used the proposed pattern myself
in some projects.  BUT the name I've given to the automatically called
method is #postNew, *not* initialize, and it goes with a corresponding
#postCopy method called by #shallowCopy.
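
A minimal sketch of that arrangement, for illustration only.  The class
name PostNewSketch is hypothetical, and the sketch assumes an image in
which #copy does not itself send #postCopy (otherwise the hook would run
twice for #copy).

```smalltalk
"PostNewSketch is a hypothetical root class used only for this sketch."

"PostNewSketch class >> new"
new
    "Create the instance, then give it a chance to set itself up."
    ^self basicNew postNew

"PostNewSketch >> postNew"
postNew
    "Default hook: nothing to do.  Overriding methods must answer self."
    ^self

"PostNewSketch >> shallowCopy"
shallowCopy
    "Copy the instance, then let the copy fix up state it must not share."
    ^super shallowCopy postCopy

"PostNewSketch >> postCopy"
postCopy
    "Default hook: nothing to do.  Overriding methods must answer self."
    ^self
```

The names say only *when* the hooks run; neither promises that the
object is fully initialised afterwards.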

	This is not the goal of the pattern.  It is a convenient
	short-hand notation which seems applicable in enough cases to
	make it generally available.
	
I have frequently acknowledged the seeming naturalness of the pattern.
What I'm still waiting for is some acknowledgement by the proponents
that the combination of (1) a misleading name and (2) automatic
ubiquitous presence introduces very real risks.

	Perhaps it is worthwhile to highlight the major advantages of
	the pattern again (much of this got lost over specific details
	which - while important - don't really address the intent of the
	pattern):

	a) Convenience

	The pattern is convenient, as you can rely on the fact that if
	you provide a default initializer it gets called.

It is precisely the seductive convenience which troubles me,
and the idea that such a thing as a default initializer makes sense
>most< of the time.

I started dipping into the system classes pretty much at random.

CompositeTransform
    - doesn't have an #initialize method
    - instances should be created using 
      CompositeTransform class>>globalTransform:localTransform:
      CompositeTransform class>>fromRemoteCanvasEncoding:

    This is the very first class I stumbled across, and clients
    really should NOT call #new on it.  This is exactly one of the
    classes which should have
      new
        self shouldNotImplement
      globalTransform: gt localTransform: lt
        ^self basicNew globalTransform: gt localTransform: lt

    In fact the banning of #new should be in the parent class
    DisplayTransform.

MIDIFileReader
    - doesn't have an #initialize method
    - instances should be created using, well, they just shouldn't.
      The class methods are generally
      .... (self new readMIDIFrom: aBinaryStream) asScore ....

    This is the very second class I stumbled across.  Once again,
    clients should *NOT* call #new on it.  No "default initializer"
    would be of any use at all; initialisation requires a stream
    argument (#readMIDIFrom:  _is_ the initialiser).

PitchBendEvent
    - doesn't have an #initialize method
    - is a textbook case of the bad pattern I want Squeak to make
      *harder* and the proponents of the #new->#initialize pattern
      want to make fatally *easier*.  PitchBendEvent new returns
      an *un*initialised object.
    - No sensible no-argument initialiser is possible.  The actual
      initialiser is #bend:channel:.  Instead of the current use,
      PitchBendEvent new bend: x channel: y
      there should be an instance creation method so that one uses
      PitchBendEvent bend: x channel: y
    - Once a sensible instance creation method is provided,
      PitchBendEvent>>new should be self shouldNotImplement.
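
    The last two bullets could be sketched roughly as follows.  This is
    an illustration, not code from the image: it assumes only that the
    existing instance method #bend:channel: stores its two arguments.

```smalltalk
"PitchBendEvent class >> bend:channel:"
bend: x channel: y
    "Answer a fully initialised event.  The cascaded #yourself guards
     against #bend:channel: answering something other than the receiver."
    ^self basicNew bend: x channel: y; yourself

"PitchBendEvent class >> new"
new
    "Clients must use #bend:channel: instead."
    ^self shouldNotImplement
```

    The creation method goes through #basicNew, so it keeps working
    after #new has been banned.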

I think three classes is enough to demonstrate my point.
In each case, I chose a class I had *never* looked at before
in an area of Squeak I had *never* looked at before.  And in each
case, it made NO sense for ordinary clients to call #new; in each
case MORE information was required for initialisation than a
no-argument initialiser could have ready access to.

It defies belief that a pattern which showed up every time I looked
could be rare.  The pattern is "No #'new's is good news", and it is
a common one.

What the heck, let's try another class, again one that I have never
looked at before, in an area of Squeak I have never looked at before.

ListItemWrapper
    - there are two instance variables
    - there is no #initialize method
    - clients should create instances using
	ListItemWrapper with: anItem
	ListItemWrapper with: anItem model: aModel
      *not* by sending #new.

    Once again (and this is getting monotonous), the class should have
      new
        self shouldNotImplement.
      with: anItem
        ^self with: anItem model: nil
      with: anItem model: aModel
        ^self basicNew setItem: anItem model: aModel

I'm finding "No #'new's is good news" wherever I look.

The following points appear to be beyond dispute:

    (1) Stretchy containers CAN and SHOULD have a #new method
        which answers a new empty but extensible container.

    (2) Precisely *because* such containers have a use for default
        initialisation, it would be dangerous for them to automatically
        inherit an #initialize which does nothing, because that could
        far too easily conceal from the programmer the fact that the
        necessary method is missing.
        
	(While OrderedCollection _could_ use a no-argument initialiser
	 to implement #new, it so happens that it doesn't.)

    (3) Because streams (at least ANSI streams) support a #close method,
        it makes sense to have a #new method in the class which answers
        a new stream in a closed state.  (Although this is really only
        a good idea if there is some kind of #reOpenOn: message.)

    (4) Precisely *because* such streams have a use for default
        initialisation, it would be dangerous for them to automatically
        inherit an #initialize which does nothing, because that could
        far too easily conceal from the programmer the fact that the
        necessary method is missing.

    (5) There are many classes where you need to supply some information
        when you create an object.  Clients of these classes in no way
        benefit from their inherited #new, and in fact it would be very
        useful if they were in some way prevented from calling #new on
        such classes.  Because initialisation requires added information,
        an #initialize method is of no use to such classes.

    (6) There are many methods in the system which *DO* use
        super new initialize
        In Squeak 3.5 patch level 5180, there were *APPROXIMATELY*

        90 "super new initialize"                  in V3.sources
         5 "super new initXXX" (other XXX)         in V3.sources
        19 "super new initXXX: x"                  in V3.sources
         4 "super new initXXX: x yYY: y ..."       in V3.sources

        30 "super new initialize"                  in 3.5-5180.changes
         1 "super new initXXX" (other XXX)         in 3.5-5180.changes
        12 "super new initXXX: x"                  in 3.5-5180.changes
         7 "super new initXXX: x yYY: y ..."       in 3.5-5180.changes

	That is, 48 "super new init{something}" methods which would NOT
	be helped by #new calling #initialize automatically, and 120
	which MIGHT.  But to put that 120 into perspective,
	Object allSubclasses size => 3762
	Also, many (in fact, *most*) of the "super new initialize" calls
	that I have investigated could just as well have been
        "self basicNew initialize".

        I don't have any precise figures here, but based on my examination
        of Squeak 3.5-5180, the proportion of classes which could be in any
        way simplified by having #new automatically call #initialize is
        about one class in 60.

I make no claim that the classes I checked under point (6) that *do*
use the "super new initialize" pattern *should* use it; on the contrary,
it seems that quite a few of them *shouldn't*.  What does seem clear
(based on sampling,
not exhaustive survey) is that making it *HARDER* to use #new would have
a positive effect on code quality, while making it *EASIER* would have
very low payoff.

	The pattern is already a de-facto standard as pretty much every
	class that implements an (instance) side #initialize method
	expects this to be the default initializer.  The pattern merely
	recognizes this.

The vast bulk of #initialize methods are either class initialisation
or part of the Morph protocol and are therefore not affected by the
#new->#initialize proposal.  (In the first case because they are not
instance methods, in the second case because the method is already
called automatically.)

Of the 233 #initialize methods that remain, the very first one that
I looked at (in XMLDomParser, SAXhandler) was yet *another* instance
of the "No #'new's is good news" pattern; the effect of defining #new
this way was to make an object on which #initialize had been called
but which was NOT sufficiently initialised to be usable.

This is getting beyond a joke:  every time I look seriously at another
class that I haven't looked at before, I find one which *shouldn't*
be making a parameterless #new available to clients.

What we need is a pattern which recognises THIS fact.

	b) Fault tolerance
	How often have you written an (instance) initialize method, used
	it (out of laziness) as in "FooClass new initialize" only to
	recognize at some point that you really (once more) should
	implement the new/initialize pattern and got bitten by the
	double initialize?

Never in the last couple of years.  You see, I have learned to regard
"FooClass new" as suspicious in and of itself.  I have *especially*
learned to be wary, not to say vigilantly trepidant, of #initialize
methods.

The really serious problem here is not "double initialize" (which should
usually be harmless) but "parameterless #initialize" (which is usually
wrong).

	I, for myself, more often than I dare to admit and Squeak itself
	does contain "double initializers" for exactly this reason.
	Introducing the pattern simply avoids those kinds of mistakes.

That may very well reduce *one* kind of fault, but it does so by
"legitimating" a rather more common and much more serious fault.
The nett effect really isn't fault "tolerance" but fault ENCOURAGEMENT.

	c) Does not prevent other forms of initialization

Did anybody say it did?

	It _does_ implicitly favour the pattern (as it is more work to
	provide a different form of initialization rather than using the
	one provided for you) but this argument would only be
	problematic if the new/initialize pattern is the _only_ thing
	you know about object initialization.

The argument here is invalid.  Implicitly favouring parameterless
#new and parameterless #initialize is problematic NOT because of what
anyone does or doesn't know, but because it is almost always the wrong
way to initialise and the wrong way to create an instance.

	d) It allows controlled exposure to the meta layer

Which would be just wonderful IF it weren't designed to encourage
bad programming practice.  I don't disagree with a single thing
here about controlled exposure to power.  What I disagree with is the
idea that implicitly leading people into a trap and then showing them
how to get out of it is in any way better than NOT leading them into
the trap in the first place, and providing controlled exposure to the
meta layer some other way.

	The pattern does not solve issues of classes requiring specific kinds of
	initialization (such as TaxRule, FileStream, Socket etc) and it is not
	intended for it. It's not a "one size fits all" solution and it is not
	intended as such.
	
But the #new-calls-#initialize hack *IS* a "one size forced on all"
solution.  You have to take positive steps to avoid it.

	It would be nice if we could get the discussion back on these issues rather
	than talking about specific cases which the pattern is explicitly not
	designed to address.

We really aren't talking about a pattern.  We're talking about a change
to the kernel classes (specifically Object) with the effect of forcing
certain new behaviour on *every* class where it isn't specifically
cancelled, whether this new behaviour is appropriate or not.

Or if you want to insist on using "pattern" terminology, we're talking
about a change to make an *anti*-pattern the default behaviour of Squeak.


