Temporary variables

Wed Jun 16 23:28:05 UTC 1999

Hi,

The number of temporary variables in a method is an implementation detail
and limit. It seems to vary in different versions of Smalltalk. This limit
is not part of the language or the proposed ANSI Smalltalk standard
(please correct me if I am wrong). Any limit chosen is purely arbitrary
(or required by an implementor).

I feel that Smalltalk should not impose a limit as you'll encounter
situations where you'll need just one or two more than the limit just when
you have no time to refactor the method. Isn't this usually the case with
limits of various types? Like variable name length. Realistically though, an
unlimited number of temporary variables is not going to be very high on a
Smalltalk virtual machine implementor's priority list...

That said, it's very rare to have a method that really needs more than 25
variables, but it does happen. Re-organizing the code so that it uses
fewer temporary variables is the obvious thing to do. However, sometimes
people write very long methods which require lots of temporaries and they
are not easily or quickly refactored.

I personally discourage long methods. My rule of thumb is that if a method
is longer than 10-20 lines (depending on pretty printing) it should be
factored into multiple methods. This helps me avoid having too many
temporaries. Sometimes this is difficult or time consuming to do. Maybe the
code was written by someone with less "concern" for short, neat, tidy, and
clean coding practices. 

I came across this type of situation recently when porting code from one
Smalltalk to another. The newer Smalltalk has a lower limit on the number
of temporary variables (the method had one or two more temporary variables
than the new Smalltalk permitted) and it was a pain to port these methods.
Instead of just filing them in and dealing with them later I had to rewrite
them just to get them to compile. Annoying.

I often write short lines of Smalltalk code and use lots of temporary
variables just to make the code easier to read. As a result I often have
more than 10 temporaries (< 5 is normal, 5 - 10 is less common,
11-15 is not that uncommon, > 15 is unusual, > 25 is rare but I have
encountered it). The code could easily be "obfuscated" to reduce the number
of temporaries but I prefer to write clear methods at the expense of extra
temporaries. The extra temporaries allow for easier debugging and
understanding of the code. The extra clarity gained in reading the
method is important for people who maintain the code after you are done
with it. I think it's in the spirit of a practice called "Literate
Programming" invented (or described) by Donald Knuth in a book by that
very name.
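
To make the trade-off concrete, here is a small sketch of the same
calculation written both ways. The Invoice class and its accessors (items,
taxRate, discount, price) are hypothetical names made up for illustration,
and the "Class >> selector" lines are just the usual notation for showing
which class a method belongs to, not something you type into the compiler.

```smalltalk
"Compact version: no temporaries, but nothing to inspect in the debugger."
Invoice >> totalDue
    ^ (self items inject: 0 into: [:sum :each | sum + each price])
        * (1 + self taxRate) - self discount

"Same calculation with each intermediate result given a name."
Invoice >> totalDue
    | subtotal tax total |
    subtotal := self items inject: 0 into: [:sum :each | sum + each price].
    tax := subtotal * self taxRate.
    total := subtotal + tax - self discount.
    ^ total
```

Both versions answer the same value. The second costs three temporaries,
but each step can be read, inspected, and stepped over on its own.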

I aim to reduce the size of large methods by factoring the component
expressions into other methods or even into new classes of objects. 

Sometimes a method is just trying to do too much in a single place. In this
case it's easy to refactor it into multiple methods and quite often you
gain flexibility in the object. Other times what the method needs to do is
just too complex a process with too much state needed to be remembered
between expressions and refactoring it into an object is suggested.

Often when I can't get a method down to a reasonable size I consider if it
should be its own object with instance variables instead of temporaries.
Complex methods usually work better when they are an object and have an
extra benefit - they can be reused outside of the original design context. 

An example is in order. Recently I was writing a Bezier Curve calculation
method and it was getting too big and complex so I refactored it into its
own object. It's now reusable beyond the initial usage (the curve of a
roadway in an engineering application) and I've added additional methods
that were not in the original requirements (like displaying itself on a
bitmap). In addition as an object it now takes advantage of caching some of
the calculated values (in instance variables) to improve the performance on
subsequent message requests that it receives.
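
A minimal sketch of what such an object might look like follows. This is
not my actual class: the name BezierCurve, the selectors, and the choice to
cache the polynomial coefficients of a cubic curve are all assumptions made
for illustration. The class definition message shown is the older
Squeak/VisualWorks style and differs between dialects.

```smalltalk
"Hypothetical class: four control points plus three cached coefficients."
Object subclass: #BezierCurve
    instanceVariableNames: 'p0 p1 p2 p3 a b c'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Sketches'

BezierCurve >> p0: start p1: control1 p2: control2 p3: end
    "Set the control points (Points) and throw away any cached values."
    p0 := start.
    p1 := control1.
    p2 := control2.
    p3 := end.
    a := b := c := nil

BezierCurve >> computeCoefficients
    "Rewrite the curve as a*t^3 + b*t^2 + c*t + p0 and cache a, b and c."
    c := (p1 - p0) * 3.
    b := ((p2 - p1) * 3) - c.
    a := p3 - p0 - c - b

BezierCurve >> pointAt: t
    "Answer the point on the curve for 0 <= t <= 1.
     The coefficients are computed only on the first request."
    a isNil ifTrue: [self computeCoefficients].
    ^ (((a * t) + b) * t + c) * t + p0
```

What used to be temporaries in one long method (the coefficients) are now
instance variables, so repeated requests reuse them. For example:

```smalltalk
| curve |
curve := BezierCurve new p0: 0@0 p1: 0@10 p2: 10@10 p3: 10@0.
curve pointAt: 0.5    "the midpoint of the curve, x = 5 and y = 7.5"
```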

Another way to know if your method is getting a bit unwieldy is if it has
lots of parameters (or if, when breaking it apart, the new factored
methods would have lots of parameters). This results when you have a
complex method that is implementing a complex process with lots of
calculations (e.g. engineering software) where you must have lots of
temporaries to maintain the "computational" state. Again this is an ideal
candidate for a new kind of object. 

Unfortunately, calculation intensive methods or complex methods can't
always use collections (of whatever kind) to come to the rescue. A
combination of refactoring techniques such as splitting the method,
reorganizing it into an object, rethinking how it's written by making use
of other existing objects, removing unnecessary temporaries (like my use of
temporaries for easier debugging), and other techniques can assist you in
making your code easier to create, debug, test, maintain, and reuse.

A similar problem exists for objects. Let's say that you took an overly
complex method with 25 temporaries and made it an object. If the object has
25 variables maybe it needs to be refactored. Then again maybe not all the
25 variables would need to become instance variables as some of them would
work just fine as temporaries... It's just not so cut and dried.

I wrote a 3D matrix transformation class last summer for 3D graphics. It
has 16 instance variables to hold the 4x4 matrix of floating point numbers
and 1 instance variable for a flag. I am considering adding a few more
flags to improve the performance with optimizations in some obscure, but
sometimes frequently used situations. This is a lot of instance variables
in an object. But their use is justified for performance and space reasons.
The code just runs faster using this approach than a collection to hold the
matrix. 
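
Roughly, the idea looks like the sketch below. It is not the real class:
the name Transform3D, the variable names, the row/column layout, and the
meaning of the flag (here "is this still the identity matrix?") are
assumptions for illustration, and only the affine case (w = 1) is shown.

```smalltalk
"Hypothetical class: one instance variable per matrix cell plus one flag."
Object subclass: #Transform3D
    instanceVariableNames: 'a11 a12 a13 a14 a21 a22 a23 a24 a31 a32 a33 a34 a41 a42 a43 a44 isIdentity'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Sketches'

Transform3D >> setIdentity
    "Reset to the identity matrix and raise the flag."
    a11 := a22 := a33 := a44 := 1.0.
    a12 := a13 := a14 := a21 := a23 := a24 := 0.0.
    a31 := a32 := a34 := a41 := a42 := a43 := 0.0.
    isIdentity := true

Transform3D >> translateX: dx y: dy z: dz
    "Add a translation (valid while the top-left 3x3 part is the identity)
     and lower the flag."
    a14 := a14 + dx.
    a24 := a24 + dy.
    a34 := a34 + dz.
    isIdentity := false

Transform3D >> transformX: x y: y z: z
    "Answer an Array of the transformed x, y and z, assuming w = 1.
     The flag lets the common 'nothing to do' case skip the arithmetic."
    isIdentity ifTrue: [^ Array with: x with: y with: z].
    ^ Array
        with: (a11 * x) + (a12 * y) + (a13 * z) + a14
        with: (a21 * x) + (a22 * y) + (a23 * z) + a24
        with: (a31 * x) + (a32 * y) + (a33 * z) + a34
```

Each cell is read directly as an instance variable, with no at: message
sent to a collection, which is where I would expect the speed difference
to come from. The cost is a long and rather ugly variable list.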

I once had an object with 22 instance variables (it implemented a full
object oriented drag and drop in VisualWorks Smalltalk in 1992). This was a
lot of instance variables. I could only have reduced it by a few instance
variables as there were many calculations and I wanted to cache the values
for clarity, performance and debugging reasons. I took a lot of flak from
one of the original designers of Smalltalk (with whom I worked at the
time). But the 22 variables were defensible and I could show why each and
every variable was needed. For the 2 or 3 variables that were "extraneous"
I defended them on the basis of clarity and debugging - I would not remove
them JUST to reduce the count of the variables as this would obfuscate the
object and make it harder to debug in operation. It's difficult to debug a
realtime user controlled process so having the extra information around
later helps understand how it's working or not working. Also the object was
already collaborating with other custom objects to support the objects
being dragged so it could not obviously be "refactored" into multiple
classes.

When using this many temporary or instance variables I highly recommend
that you take extra time and care to document the method or object. A clear
and complete explanation of the use of each variable can save someone a
lot of time in understanding the reason behind the large number of
variables. 

One of the key reasons to limit the number of variables in a
method or object, or the number of lines in a method, is that studies of
humans reveal that people can hold 7 +/- 2 (5 to 9) things in their conscious
mind at a time. This is why phone numbers are seven digits. When you have
more than 7 +/- 2 items you need to chunk them into pieces. Thus area codes
are born, as in 888-123-4567. It just makes it easier for the human brain to
process and remember.

I thought that I was bad with 22 variables in an object. The most
extreeeeeeeeeeeeme case of too many variables I've come across is a
"relational table" that someone insisted on translating directly to one
class of object. The table, and then the object, had close to 300
fields/instance variables. They were nuts to have done this. Even after
refactoring it one of the object classes had over 50 variables! Uggg. I
encouraged them to refactor some more, but they refused because the code
was already working and the project was nearing completion. Yuck. 

Good design and implementation is not cut and dried. It's fuzzy. Guidelines
are just like a hiking trail through the woods. It's ok to leave the trail
sometimes as long as you can backtrack or have a compass and map to get
back to the trail.

If it's at all possible, keep your methods slim and easy to read.

All the best,