Temporary variables

Fri Jun 18 10:43:55 UTC 1999

Hi,

I appoligize in advance for the length of this email message as I know that
most of them are kept below 10 lines!!! This is the exception to the
guideline... ;--).

Here are some hard numbers from a production image of VSE smalltalk that a
client of mine has. 

As you can see the top 90.9133% of methods have ZERO or ONE temporary
variables! 

While methods with FIVE OR LESS temporary variables represent the top
98.6607% of all methods in the system! 

99.6291% of methods have TEN OR LESS temporary variables! The stats show
just how rare (0.3709%) the usage of lots of temporary variables are.

I think that this clearly shows that almost all the methods in the system
have 5 or less temporary variables. 

I think it also shows why people are "fervent" about this topic. 
Experienced Smalltalk users unconsciously know these stats just by browsing
the source code. 

I refute the notion that methods should have a small number of temporary
variables just because 98.6607% of the methods in the system do. This is a
silly notion. It's much more important to have a clearly written and well
factored method than to count the temporary variables.

43679 compiled methods in image.

35312 methods with 0 temps for 80.8443%.  The top 80.8443%.
4398 methods with 1 temps for 10.0689%.  The top 90.9133%.

1819 methods with 2 temps for 4.1645%.  The top 95.0777%.
848 methods with 3 temps for 1.9414%.  The top 97.0192%.
469 methods with 4 temps for 1.0737%.  The top 98.0929%.
248 methods with 5 temps for 0.5678%.  The top 98.6607%.

161 methods with 6 temps for 0.3686%.  The top 99.0293%.
87 methods with 7 temps for 0.1992%.  The top 99.2285%.
77 methods with 8 temps for 0.1763%.  The top 99.4047%.
59 methods with 9 temps for 0.1351%.  The top 99.5398%.
39 methods with 10 temps for 0.0893%.  The top 99.6291%.

29 methods with 11 temps for 0.0664%.  The top 99.6955%.
22 methods with 12 temps for 0.0504%.  The top 99.7459%.
16 methods with 13 temps for 0.0366%.  The top 99.7825%.
13 methods with 14 temps for 0.0298%.  The top 99.8123%.
9 methods with 15 temps for 0.0206%.  The top 99.8329%.

16 methods with 16 temps for 0.0366%.  The top 99.8695%.
4 methods with 17 temps for 0.0092%.  The top 99.8787%.
4 methods with 18 temps for 0.0092%.  The top 99.8878%.
5 methods with 19 temps for 0.0114%.  The top 99.8993%.
2 methods with 20 temps for 0.0046%.  The top 99.9038%.

3 methods with 21 temps for 0.0069%.  The top 99.9107%.
4 methods with 22 temps for 0.0092%.  The top 99.9199%.
7 methods with 23 temps for 0.0160%.  The top 99.9359%.
5 methods with 24 temps for 0.0114%.  The top 99.9473%.

3 methods with 26 temps for 0.0069%.  The top 99.9542%.
4 methods with 27 temps for 0.0092%.  The top 99.9634%.
1 methods with 28 temps for 0.0023%.  The top 99.9657%.
3 methods with 29 temps for 0.0069%.  The top 99.9725%.
CFBuildSpans>>computeSegmentLengthFactors
CFBuildSegments>>lastParallelSegmentFrom:distance:

2 methods with 30 temps for 0.0046%.  The top 99.9771%.
CFOutputGeometryForAutoCadWindow>>createViews

1 methods with 31 temps for 0.0023%.  The top 99.9794%.
WBLayoutWizard>>createViews

1 methods with 33 temps for 0.0023%.  The top 99.9817%.

1 methods with 37 temps for 0.0023%.  The top 99.9840%.
WBColumnarListBoxEditor>>createViews

2 methods with 38 temps for 0.0046%.  The top 99.9886%.

1 methods with 39 temps for 0.0023%.  The top 99.9908%.
CFPreferencesPane>>addSubpanes

1 methods with 41 temps for 0.0023%.  The top 99.9931%.
CFControlPointsPane>>addSubpanes

1 methods with 45 temps for 0.0023%.  The top 99.9954%.
PSWBPartsTableColumnEditor

1 methods with 46 temps for 0.0023%.  The top 99.9977%.
WBTableColumnEditor>>createViews

1 methods with 48 temps for 0.0023%.  The top 100.0000%.
WBTablePaneEditor>>createViews

I plan to investigate the reason that methods in the image in question have
more than 5 temporaries and why a handfull have more than 30! (I've
annotated some of the methods above and they are as I thought - window view
creation methods and calculation or initialization type methods for the
most part. MORE than half of the methods with greater than 10 temporary
variables are in the base Smalltalk image. The other half are in my
client's engineering application).

I really think that method size and complexity is a more important issue
that the number of temporaries in a method. The names you choose for the
variables is also more important than the number of them. Often what a
large number of variables illustrates is a potential opportunity to
refactor the method. But it may be fine as it is. The number of temporaries
is really a non-issue in comparison.

My primary goal is to keep the size of the method down to under 10-20-30
lines. I like to spread lines vertically and split expressions onto their
own lines and use temporaries to hold the values of these "split"
expressions. It's a form of literate programming where you document the
method with temporary variable names as well as comments and good method
names.

In window creation methods for windows that have lots of screen view
components you need to have lots of temporaries for holding the view
instances as you create them and then configure them.

Sometimes I don't want to break apart a long method because it is better to
have it all in one place to show the whole sequence in one place.

I just don't like people telling me that "it is required to meet our style
rules". I prefer to think of them as guidelines that have exceptions. I
gather this discussion is now centering around coming up with the
exceptions to the guidelines of a small number of temporaries.

I would rather have a large number of temporaries if they claify the method
then having short concise code that is obfucated by just a few variables
and lots of expressions in "( )".

For example it's common in Smalltalk to have "sub expressions" that only
need to be passed in as parameters to other message sends:

	| anObject1 anObject3 |
	anObject1 := SomeObject new.
	anObject3 := anObject1 
		someMulti: 100 
		keyword: 'hi' 
		message: (
			SomeObject new 
				someOther: (anObject1 length * 10) 
				keywordMessage: 80
	).

I will frequently rewrite this in the following way by adding a temporary
variable for the brackted "sub expression":

	| anObject1 anObject2 anObject3 |
	anObject1 := SomeObject new.
	anObject2 := SomeObject new 
		someOther: (anObject1 length * 10) 
		message: 80.
	anObject3 := anObject1 
		someMulti: 100 
		keyword: 'hi' 
		message: anObject2.

I find that this is much clearer than the first version particularily if I
choose a good name for the new temporary that is descriptive of the kind of
object it holds. Rewriting it as follows:

	| anObject1 anObject2String anObject3 |
	anObject1 := SomeObject new.
	anObject2String := SomeObject new 
		someOther: (anObject1 length * 10)
		message: 80.
	anObject3 := anObject1 
		someMulti: 100 
		keyword: 'hi' 
		message: anObject2String.

If I have a lengthly method with lots of "sub expressions" that I convert
to their own lines with the results stored in a temporary then the number
of temporaries can grow beyond 5, 10, 15 and in RARE cases 20+.

Often I'll take "sub expressions" and make them their own messages as well.
This can reduce the size of methods and enable reuse of code fragements.

Very often I'll find that the "sub expression" may return an different
object than normal, such as "nil" when an error is detected. In this case
since I've split out the "sub expression" into it's own line I can put the
conditional logic, "ifNil:ifNotNil:", in rapidly.

Sometimes I'll find that the "sub expression" that I've split out into it's
own line is reused later in the method. I've already got it out. This is
most common when working on the method to start with and when extending
methods.

Furthermore, this "temporaries for sub expressions" usage allows me to
quickly click on the temporary variable in the debugger and see the value
or inspect it without having to "execute the expression" which could cause
a side effect.

Also, I'll sometimes add extra round bracktes, "(..)", to claify the
execution order for faster understanding of the method.

Often I won't reuse temporaries in the method. I'll just use another one.
Why not? The only reason why not is if the objects being held onto during
the method execution are going to cause memory overflows if your image is
close to it's memory limits. The another only reason why not is if you hit
the limit of temporaries of the Smalltalk your using. But, as long as your
usage of temporaries makes things clearer for you and others who maintain
your code and if it helps in debugging I say go for it as needed. An
exception to the rule of keeping your methods short.

I really don't see anything wrong with Pierre Roy's long method
(following). It is large for sure but it does have a full sequence that may
be best left intact for clairty of understanding. Adding a dictionary to
hold all the pitches would potential obfuscate it with lots of "(pitchDict
at: #la put: (...))" and "(pitchDict at: #la)" type of code. Lots more
round brackets would flood the method, "( )", obfuscating the simple
clairty .

I've included a method that I've written that is used in production code.
You can see them below Pierre Roy's method below.

Obviously I have a different style then some of you. This is fine. Your
style is fine. There is not any one right style.  I would hope that all his
methods are not this large and I would recommend that he see if he can
refactor the method in some way to shorten it but ultimately it's his call
and not mine. It's a judgement call for him to make and Smalltalk should
not enforce some arbirtary limit to satisfy the "temporary variable
police".

A nother concern that has not yet been mentioned is that sometimes you have
to write code fast and furoiusly and just get it done. The method could be
refactored to simplify it but TIME AND MONEY don't permit it. Reality of
commercial projects.

All the best,

Peter William Lount
http://www.smalltalk.org
peter at smalltalk.org

========================
Temporary Variable or Method Variable Limits
Squeak		 		31
VisualWorks 			255
Visual Smalltalk Enterprise 	> 60 ?

=========================
Pierre Roy's long method. It's long, but easily readable without lots of
unrelated crud (i.e. at:put:, at: for dictionary accessing). Maybe the
readability is more important than some rule about the number of temporary
variables? This is actually a not so rare initialization method that has a
lot to do and must keep the state in a variable or some other object. I
might suggest that this could be refactored into an object of it's own that
manages the "AllNotes" ordered collection instead of using a class/global
variable "AllNotes".

    initialize
     "There are 35 pitch classes. et oui !"
     "self initialize"

     | la si do re mi fa sol lad sid dod red mid fad sold lab sib dob reb
mib fab solb
    ladd sidd dodd redd midd fadd soldd labb sibb dobb rebb mibb fabb solbb
|
     la := (PitchClassNatural new) semiToneCount: 10; name: #A.
     si := (PitchClassNatural new) semiToneCount: 12; name: #B.
     do := (PitchClassNatural new) semiToneCount: 1; name: #C.
     re := (PitchClassNatural new) semiToneCount: 3; name: #D.
     mi := (PitchClassNatural new) semiToneCount: 5; name: #E.
     fa := (PitchClassNatural new) semiToneCount: 6; name: #F.
     sol := (PitchClassNatural new) semiToneCount: 8; name: #G.

     lad := PitchClassSharp new natural: la.
     sid := PitchClassSharp new natural: si.
     dod := PitchClassSharp new natural: do.
     red := PitchClassSharp new natural: re.
     mid := PitchClassSharp new natural: mi.
     fad := PitchClassSharp new natural: fa.
     sold := PitchClassSharp new natural: sol.
     lab := PitchClassFlat new natural: la.
     sib := PitchClassFlat new natural: si.
     dob := PitchClassFlat new natural: do.
     reb := PitchClassFlat new natural: re.
     mib := PitchClassFlat new natural: mi.
     fab := PitchClassFlat new natural: fa.
     solb := PitchClassFlat new natural: sol.
     ladd := PitchClassDoubleSharp new natural: la.
     sidd := PitchClassDoubleSharp new natural: si.
     dodd := PitchClassDoubleSharp new natural: do.
     redd := PitchClassDoubleSharp new natural: re.
     midd := PitchClassDoubleSharp new natural: mi.
     fadd := PitchClassDoubleSharp new natural: fa.
     soldd := PitchClassDoubleSharp new natural: sol.
     labb := PitchClassDoubleFlat new natural: la.
     sibb := PitchClassDoubleFlat new natural: si.
     dobb := PitchClassDoubleFlat new natural: do.
     rebb := PitchClassDoubleFlat new natural: re.
     mibb := PitchClassDoubleFlat new natural: mi.
     fabb := PitchClassDoubleFlat new natural: fa.
     solbb := PitchClassDoubleFlat new natural: sol.

     la following: si; preceding: sol; sharp: lad; flat: lab.
     si following: do; preceding: la; sharp: sid; flat: sib.
     do following: re; preceding: si; sharp: dod; flat: dob.
     re following: mi; preceding: do; sharp: red; flat: reb.
     mi following: fa; preceding: re; sharp: mid; flat: mib.
     fa following: sol; preceding: mi; sharp: fad; flat: fab.
     sol following: la; preceding: fa; sharp: sold; flat: solb.

     lad sharp: ladd.
     sid sharp: sidd.
     dod sharp: dodd.
     red sharp: redd.
     mid sharp: midd.
     fad sharp: fadd.
     sold sharp: soldd.
     lab flat: labb.
     sib flat: sibb.
     dob flat: dobb.
     reb flat: rebb.
     mib flat: mibb.
     fab flat: fabb.
     solb flat: solbb.

     A := la. B := si. C := do. D := re. E := mi. F := fa. G := sol.

    AllNotes := OrderedCollection new.
    AllNotes add: do; add: re; add: mi; add: fa; add: sol; add: la; add:
si; add: dod;
    add: red; add: mid; add: fad; add: sold; add: lad; add: sid; add: dob;
add: reb; add:
    mib; add: fab; add: solb; add: lab; add: sib; add: dobb; add: rebb;
add: mibb; add:
    fabb; add: solbb; add: labb; add: sibb; add: dodd; add: redd; add:
midd; add: fadd;
    add: soldd; add: ladd; add: sidd.

     self initializeFrenchNames.
     English := true.
     self initializeAllNaturalNotes;  initializeGlobals

=====
Some methods by Peter Lount that "violate the style guides"!

BezierCurve instance method.

displayOn: theBitmap 
steps: theNumberOfSteps
preTranslateBy: thePreTranslatePoint
scaleBy: theScalePoint
postTranslateBy: thePostTranslatePoint

	| aPen aListOfPointsOnBezierCurve aFractionParameter aPointOnBezierCurve 
	aListOfIntegerControlPoints aListOfIntegerPoints |

	aListOfPointsOnBezierCurve := OrderedCollection new.

	0 to: theNumberOfSteps do: [:anIndex |
		aFractionParameter := (anIndex / theNumberOfSteps) asFloat.
		aPointOnBezierCurve := self computePointOnBezierCurveAt:
aFractionParameter.
		aListOfPointsOnBezierCurve add: aPointOnBezierCurve.
	].

	aPen := theBitmap pen.
	aListOfIntegerPoints := aListOfPointsOnBezierCurve collect: [:aPoint |
		(((aPoint + thePreTranslatePoint) * theScalePoint)  +
thePostTranslatePoint) rounded
	].

	aPen foreColor: Color black.
	aPen polyline: aListOfIntegerPoints.

	aListOfIntegerControlPoints := self integerControlPoints collect: [:aPoint
|
		(((aPoint + thePreTranslatePoint) * theScalePoint) +
thePostTranslatePoint) rounded 
	].

	aPen foreColor: Color darkRed.
	aPen polyline: aListOfIntegerControlPoints.
	aListOfIntegerControlPoints do: [:aPoint |
		aPen place: aPoint.
		aPen circleFilled: 3
	].

	^aListOfPointsOnBezierCurve

=====
"Workspace that calculates the temporary variable stats... for
VisualSmalltalkEnterprise."

    | aTempCount aDict aNumberOfCMWithCount aNumberOfCMInImage
aCMInImageCount
    aTopPercent aPercentOfTotal aCurrentCMList aHighCountList |
    aDict := Dictionary new.
    aCMList := CompiledMethod allInstances.
    aCurrentCMList := aCMList select: [:aCM |
        (aCM classField compiledMethodAt: aCM selector) == aCM
    ].
    aHighCountList := SortedCollection sortBlock: [:a :b |
        a printString <= (b printString)
    ].
    aCurrentCMList do: [:aCM |
        aTempCount := aCM tempCount.
        aTempCount > 10 ifTrue: [
            aHighCountList add: (aCM printString, ' has ',
                aTempCount printString, ' temporary variables.')
        ].
        aNumberOfCMWithCount := aDict
            at: aTempCount
            ifAbsent: [0].
        aDict at: aCM tempCount put: aNumberOfCMWithCount + 1.
    ].
    aHighCountList do: [:aCMString |
        Transcript
                show: aCMString;cr.
    ].
    "Produce a report in sorted sequence..."
    aCMInImageCount := aCurrentCMList size.
    aTopPercent := 0.
    Transcript show:
        aCMInImageCount printString,
        ' compiled methods in image.'; cr.
    aDict keys asSortedCollection do: [:aTempCountKey |
        aNumberOfCMWithCount := aDict at: aTempCountKey.
        aPercentOfTotal :=  ((aNumberOfCMWithCount / aCMInImageCount) *
100) asFloat.
        aTopPercent := aTopPercent + aPercentOfTotal.
        Transcript show:
            aNumberOfCMWithCount printString, ' methods with ',
            aTempCountKey printString, ' temps for ',
            aPercentOfTotal asFloat printFour, '%. ',
            ' The top ', aTopPercent printFour, '%.';
            cr.
    ].