** Original Sender: Helge Horch Helge.Horch@munich.netsurf.de
At 18:55 30.07.99 -0400, Chris Norton wrote:
I will admit to having used i-vars directly in the past. At the time, it seemed like a good way to ensure privacy, but upon reflection I decided that this was silly. Any developer can just add his/her own accessor. So now my practice is to create accessors, but label them as "private" and put them into "private" method categories. Since Smalltalk does not enforce privacy, we have to assume that our fellows will adhere to our "code of conduct".
</Soapbox>
<Déjà vu> I like the way Kent Beck reasons about it. In [SBPP] he lists *both* "Direct Variable Access" (p.89) and "Indirect Variable Access" (p.91) as valid coding patterns, alluding to the obvious schizophrenia, and mentioning encapsulation and dogmatism. ([SWS] is leaning towards indirect-only, IIRC.) In his recent [GTBS], featuring a reprint of his 1993 Smalltalk Report column "To Accessor or Not to Accessor", I found his leading remarks from today's view most interesting: "[...] It wasn't until I rewrote the whole thing as patterns for the book [SBPP] that I realized the key issue here is communication." (The column itself stresses the importance of consistency in one's ways.) His conclusive remark: "Anyway, if this one bugs you, ignore it, all except the part about making accessors private by default."
Helge:
I am very glad you posted the Kent Beck quotes. I was fully intending to do the same, but for once I remembered to read to the end of the message list before prematurely responding.
All:
I fully agree with KB's position (squarely on the fence, judging each case on its merits). But I would like to add the following observations:
1. Finding all references to an instance variable in a class hierarchy is even easier than finding all senders of some message, because there is no ambiguity with respect to which class implements the name (when renaming methods, care must be taken to only rename those message sends where the receiver will be an instance of a class where the method was renamed). Since this is so, changing the name or structure of the slots of an object is easier than changing its interface. A design change that would require renaming or removing an instance variable would very likely also require renaming or removing the associated accessor and mutator methods. So the debate centers on those few cases where the accessors/mutators would remain, but the instance variable(s) would not (e.g., the Point example already mentioned).
2. All classes in Smalltalk can be changed by any programmer. If an instance variable offends thee, thou mayest rename or remove it. Same goes for methods. Similarly, any programmer can add methods to a class to access otherwise encapsulated instance variables--and can also use #instVarAt: and #instVarAt:put: to defeat any message-based encapsulation of an instance variable. However, this does not mean that one should give up on the attempt to use method-mediated encapsulation of state in order to enforce invariants and/or constraints. If object encapsulation had no value, then OOP would be pointless.
3. Barbara Liskov, in her speech at OOPSLA '87 in Orlando, noted that Smalltalk-style class inheritance (where any subclass has full access to the internals of its superclasses) breaks encapsulation. The issue here is that one source may provide a framework class meant to be subclassed, and another programmer may create a subclass that then breaks when the code is ported to a new version of the framework in which one or more instance variables (for example) have been renamed or removed (perhaps due to a design change, such as converting Point from Cartesian to polar). This is where the issue of direct access to instance variables may bite the hardest. On the other hand, the same issue is present with respect to methods, whether public or private.
4. The Point example is actually bogus. The right approach is to have an abstract superclass with CartesianPoint and PolarPoint subclasses. There is no valid argument for defending against some future wholesale replacement of the x and y instance variables of the CartesianPoint class. Those instance variables would inherently have to be there. One might want to rename them, but the same issue would exist with respect to accessor/mutator methods (only more strongly, given point #1 above). The only problem (unfortunately) is that a solution analogous to the one for Point/CartesianPoint/PolarPoint may not be applicable in all cases.
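The hierarchy suggested in point 4 can be sketched as follows. This is a rough illustration in Python rather than Smalltalk, so that it can be run as-is. CartesianPoint and PolarPoint are the names used above; AbstractPoint, the method names, and the distance_to example are assumptions made only for this sketch.

```python
import math
from abc import ABC, abstractmethod


class AbstractPoint(ABC):
    """Shared protocol; each subclass decides which slots actually exist."""

    @abstractmethod
    def x(self):
        ...

    @abstractmethod
    def y(self):
        ...

    def distance_to(self, other):
        # Written only against the protocol, so any mix of subclasses works.
        return math.hypot(self.x() - other.x(), self.y() - other.y())


class CartesianPoint(AbstractPoint):
    def __init__(self, x, y):
        # These two slots are inherent to the Cartesian representation.
        self._x = x
        self._y = y

    def x(self):
        return self._x

    def y(self):
        return self._y


class PolarPoint(AbstractPoint):
    def __init__(self, r, theta):
        self._r = r
        self._theta = theta

    def x(self):
        return self._r * math.cos(self._theta)

    def y(self):
        return self._r * math.sin(self._theta)


origin = CartesianPoint(0.0, 0.0)
p = PolarPoint(2.0, math.pi / 2)
print(round(origin.distance_to(p), 6))  # prints 2.0
```

Neither subclass ever needs to swap out its own slots, so direct access to them inside the subclass carries little risk; clients that only use the shared protocol are unaffected by which representation they receive.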
5. Beware lazy initialization. It has valid uses, but there are also invalid ones, and traps for the unwary.
The best use of lazy initialization is for "caching" variables. A caching variable holds a value that can be recomputed from other data at any time. If lazy initialization is used to compute then cache the value of a variable, then setting the variable to nil at some random moment should not change the behavior of the program (other than time to execute). If setting the variable to nil at any random moment would be incorrect, then the variable is NOT a caching variable.
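The "set it to nil at any moment" test for a caching variable can be sketched like this. It is a Python analogue, with None standing in for nil; the Rectangle class and its method names are hypothetical and exist only for the illustration.

```python
class Rectangle:
    def __init__(self, width, height):
        # Fixed at creation in this sketch, so the cache can never go stale.
        self._width = width
        self._height = height
        self._area = None  # caching variable; None plays the role of nil

    def area(self):
        # Lazy initialization: compute on first use, then reuse the value.
        if self._area is None:
            self._area = self._width * self._height
        return self._area

    def flush_cache(self):
        # Safe at any moment: the value can always be recomputed.
        self._area = None


r = Rectangle(3, 4)
before = r.area()
r.flush_cache()
after = r.area()
print(before == after)  # prints True
```

If flushing the slot could change the answer, the slot would be holding state rather than a cache, and lazy initialization would be the wrong tool for it.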
Another case where lazy initialization is a good thing is class variables, but only because of the likelihood that a class will be filed in without the #initialize message being sent to it (it's amazing how often this seems to happen).
Finally, lazy initialization is a necessary technique for solving certain transitive closure and resource usage problems (where fully initializing everything either never ends, or does way more work and/or resource allocation than is actually necessary).
One case where one would not use lazy initialization is for setting the value of "identity variables," which serve to specify the identity of a domain object in its domain. "Identity" variables are usually set once (usually when the object is first instantiated), and then never changed. An identity variable must be explicitly set, usually to a value specified by some source external to the object (e.g., the primary key of a domain object). By their very nature, they should not be lazily initialized. Using lazy initialization to set the social security number of aPerson just makes no sense, so this mistake is not common. In fact, one can use the fact that there is no good default value for a variable to spot an identity variable.
State variables present a more difficult case. A "state" variable is one that holds some changeable state of an object, such as the mailing address of a Person. Lazy initialization of "state" variables seems to work fine, until you need to return an object to its initial and/or default state. One would like to do this by sending the object the message #initialize, but it often turns out that a) there is no such message, because lazy initialization was used exclusively, or b) there is such a message, but for a variety of reasons it does not put the object into the desired state (either because it was never fully implemented, or because it was not kept in sync with the lazy initialization system).
Another problem is debugging. When state variables are lazily initialized, then one can easily come to false conclusions about object state during a debugging session, when "nil" doesn't really mean "nil".
But the worst problem arises when lazy initialization causes a transitive closure and/or resource usage problem, instead of helping to solve one. This is classic: You religiously send accessor messages to fetch state. The accessors use lazy initialization, and gleefully create unneeded objects. You asked for the object, there wasn't one there, so it got created and returned to you by the oh-so-helpful accessor (not!). My favorite is #release code that reinitializes an instance variable just so it can send #release to its value (e.g., "self controller release"). Don't go there. (And this is one case where it is probably better to access the instance variable directly).
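The #release trap can be shown with a small Python analogue. View, Controller, the `created` counter, and both release methods are hypothetical names invented for this sketch; only the "self controller release" shape comes from the text above.

```python
class Controller:
    created = 0  # counts how many controllers were ever built

    def __init__(self):
        Controller.created += 1
        self.released = False

    def release(self):
        self.released = True


class View:
    def __init__(self):
        self._controller = None

    def controller(self):
        # Lazily initializing accessor: creates a Controller on demand.
        if self._controller is None:
            self._controller = Controller()
        return self._controller

    def release_via_accessor(self):
        # The trap: may build a Controller only to release it immediately.
        self.controller().release()
        self._controller = None

    def release_directly(self):
        # Inspects the slot itself, so nothing is created on the way out.
        if self._controller is not None:
            self._controller.release()
            self._controller = None


View().release_via_accessor()
print(Controller.created)  # prints 1: a controller was built just to be released
View().release_directly()
print(Controller.created)  # prints 1: no new controller was created
```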
For these reasons, it is better to have an #initialize method that sets all state variables to their default values, sets all caching variables to nil, and leaves all identity variables unchanged. The #new method of the root class in the hierarchy should then answer "self basicNew initialize" (#basicNew instead of "super new," because of the possibility that the superclass may someday be doing the same thing!).
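A rough Python analogue of that #initialize discipline follows. Here the constructor stands in for "self basicNew initialize", which is an assumption of the sketch, as are the Person fields, the default address, and the placeholder identifier.

```python
class Person:
    DEFAULT_ADDRESS = "unknown"  # hypothetical default for the state variable

    def __init__(self, ssn):
        self._ssn = ssn  # identity variable: set once, explicitly, from outside
        self.initialize()

    def initialize(self):
        # State variables go back to their defaults.
        self._address = Person.DEFAULT_ADDRESS
        # Caching variables are simply cleared.
        self._label = None
        # Identity variables (self._ssn) are deliberately left alone.
        return self

    def ssn(self):
        return self._ssn

    def address(self):
        return self._address

    def set_address(self, address):
        self._address = address
        self._label = None  # the cached label depends on the address

    def label(self):
        # Lazy initialization is fine here: the label is recomputable.
        if self._label is None:
            self._label = f"{self._ssn} @ {self._address}"
        return self._label


p = Person("000-00-0000")
p.set_address("1 Main St")
print(p.label())    # prints 000-00-0000 @ 1 Main St
p.initialize()      # back to the default state
print(p.address())  # prints unknown
print(p.ssn())      # prints 000-00-0000
```

Because the state variable is never lazily initialized, a debugger shows its real value at all times, and resetting the object is a single message send.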
6. I like the way that Self deals with this issue. Very elegant. Kudos to David Ungar, et al. But I must caution that Smalltalk is not Self, and that always using accessors and mutators to access state does not change Smalltalk into Self, and does not provide the same benefits. Until we change the language, we have to deal with what we have, not with what we wish we had.
7. Whether Andres was wrong or right about direct instance variable access, it was not cool to come down on him like a ton of bricks. Andres, on the other hand, should have simply questioned why the instance variables were not being accessed directly, instead of boldly asserting they should be.
We need to treat each other with respect and diplomacy. We are ambassadors for Smalltalk, and should act accordingly when communicating in an open forum, one of whose purposes is spreading the Smalltalk gospel.
Andres: you stepped on a hornet's nest with this issue. Don't let it bother you. Score it as a learning experience, and be assured it's not the only subject that may cause a religious war to erupt among Smalltalkers.
8. Given the above, I would avoid being dogmatic on this issue. I think the fact that so many good Smalltalk programmers still access instance variables directly, and that code that commits this "sin" has lasted unchanged for so long, should cause one to at least question whether or not it's all that sinful.
I think this is a good issue for the language designers to deal with. Those just trying to code in the Smalltalk of today should consider the pros and cons, formulate a consistent strategy, and then stick with it.
9. I really hadn't intended to write such a long sermon, but the issue is complex and non-trivial.
Have a good weekend!
--Alan
Note (might as well make my allegiances clear): What I *like* about accessors is that it makes variable access a matter of message passing. Assignment is one of the few places where Smalltalk departs from the message passing paradigm (contrast with loops, conditionals, etc.). I wonder how much of the "ugliness" of using accessors for variable access simply comes from the fact that it's possible to do it with assignment. It would have been interesting, though unfortunate, if, historically, there had been grammar level conditionals as well as the message variants in the language. I wonder if we'd be arguing about *them* too! :) (Side note 2: the self atVariableIndex: foo put: bar takes care of it, of course.)
What I don't like about accessors is that they can be inconvenient to use and manipulate, as things stand. Auto generation/deletion alone doesn't cut it :) I suspect that as long as there's a firm and obvious distinction between variables and methods, folks will tend to *think* in terms of variables vs. methods, and *that* (I think) makes direct variable access tempting.
Hmm. Now I'm curious about temp variables. :) Restricting assignment to temporary variables would make the temp variable/object variable distinction clear and allow some conveniences. Hmm.
At 5:15 AM -0400 7/31/99, Alan Lovejoy wrote:
[snip]
- Finding all references to an instance variable in a class hierarchy is
even easier than finding all senders of some message, because there
*Really*? What's the key command? In Squeak it's command-N.
[snip]
- All classes in Smalltalk can be changed by any programmer. If an
instance variable offends thee, thou mayest rename or remove it. Same goes for methods.
True, but isn't a lot of this a matter of convenience? Absent the requisite tool, I'd much rather tweak an accessor than plod through all the methods in search of assignments.
The best requisite tool, in this case, is, I suspect, "someone else" :)
Similarly, any programmer can add methods to a class to access otherwise encapsulated instance variables--and can also use #instVarAt: and #instVarAt:put: to defeat any message-based encapsulation of an instance variable. However, this does not mean that one should give up on the attempt to use method-mediated encapsulation of state in order to enforce invariants and/or constraints. If object encapsulation had no value, then OOP would be pointless.
<sigh/> I just want to say that I think "encapsulation" is a really, really messy concept. And that most variants are *not* essential properties of OO systems (Kent Pitman has a very nice article on this.)
[snip]
- Whether Andres was wrong or right about direct
instance variable access, it was not cool to come down on him like a ton of bricks. Andres, on the other hand, should have simply questioned why the instance variables were not being accessed directly, instead of boldly asserting they should be.
My main problem was the notion that we *shouldn't* use accessors in order to save message sends. *That's* dangerous advice. It's akin to "use big methods to avoid message sends", or "be aware of the messages which, in certain classes, are compiled to bytecodes, like #> on Integer" (note how the advice has to be framed!). In Python, where the cost of a message send nee method invocation nee function call is exorbitant, big functions/methods (or binding the function to a local) *is* standard--and often good--advice.
But Smalltalk isn't Python.
We need to treat each other with respect and diplomacy. We are ambassadors for Smalltalk, and should act accordingly
I'm not! I wanted to be ambassador to France but couldn't pony up the donation. In any case, I have been Smalltalking long enough to be more than a "permanent resident" with a slightly weird visa. Maybe I'll get naturalized soon!
when communicating in an open forum, one of whose purposes is spreading the Smalltalk gospel.
Eh, I think you're confusing this list with the "EvangeSqueak" list :)
[snip]
- Given the above, I would avoid being dogmatic on this
issue. I think the fact that so many good Smalltalk programmers still access instance variables directly, and that code that commits this "sin" has lasted unchanged for so long, should cause one to at least question whether or not it's all that sinful.
But do you agree that doing so for the sake of performance without determining whether it *is* a performance issue in general, or even for the particular method, is wrongheaded?
[snip]
Cheers, Bijan Parsia.
Hi.
- Whether Andres was wrong or right about direct
instance variable access, it was not cool to come down on him like a ton of bricks. Andres, on the other hand, should have simply questioned why the instance variables were not being accessed directly, instead of boldly asserting they should be.
I'd like to clarify something... the subject of my original mail is "[Interval Problem] my 2 cents". I was hoping that my assertion would be clearly restricted to the Interval fix being discussed. I made the assertion based on the style already used in the coding of Interval, that is, no accessors. Then... is this a bold assertion?
"I think it would be a good idea to remove the self sends..."
I'd rather say that a bold assertion would be more like
"Self sends must be removed..."
but I could be wrong since English is my second language.
Andres.
On Sun 01 Aug, Andres Valloud wrote:
but I could be wrong since English is my second language.
As it is for most of the people on this list... :-)
squeak-dev@lists.squeakfoundation.org