Words Of Code (was Re: Squeak in IEEE Software)

Thu Feb 4 19:43:09 UTC 1999

On Mon, 1 Feb 1999, Dan Ingalls wrote:

> I doubt that any clarity will emerge on this topic.  Re-use is very hard
> to quantify.  However it is simple to state the ideal:  every
> fundamental relationship or process should only be described in one
> place.  To the extent that something appears in N places, there are N
> times as many chances to get it wrong one way or another.  Also it
> impacts testing a lot.  Often when we make a change in Squeak, it gets
> tested before the browser finishes displaying the change.  Reuse has
> this ancillary benefit of leverage in testing. 

Probably the closest you can come to measuring re-use is to measure the
amount of functionality in an app versus the amount of source code. 

Measuring the amount of functionality is probably best done using
something like "function points" or "feature points" in the application
spec (or the running application).  I'm not really an expert on the best
way to do this, but the important thing is to not base this measurement on
the source code. 

Measuring the amount of source code can be done three ways: by counting
the lines of code, the words of code, or the number of characters in the
code.  I would argue that words of code (WOC) is the best measure of
these.  Lines of code (LOC) is somewhat arbitrary... especially in
Smalltalk, it's easy to send a series of messages to an object with a
single line of code, or to spread the messages out over a few lines. 
(Although I suppose if you use a standardized code formatter, LOC is a
reasonable measurement.)  Characters of code (COC) is an even worse
measure, because a developer who used single letters for variable names
would appear to have a smaller amount of source code. 
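
To make the comparison concrete, here is a rough sketch of the three counts.
It is only an illustration: the definition of a "word" (an identifier, a run
of digits, or a single punctuation character) is my own assumption, as are
the name size_measures and the sample snippets.  Python is used just because
it is easy to run.

```python
import re

# Assumed definition of a "word": an identifier, a run of digits,
# or any single non-whitespace character (punctuation, operators).
WORD_PATTERN = re.compile(r"[A-Za-z_]\w*|\d+|\S")

def size_measures(source: str) -> dict:
    """Return LOC, WOC and COC for a piece of source text."""
    loc = sum(1 for line in source.splitlines() if line.strip())  # non-blank lines
    woc = len(WORD_PATTERN.findall(source))                       # tokens
    coc = sum(1 for ch in source if not ch.isspace())             # non-whitespace chars
    return {"loc": loc, "woc": woc, "coc": coc}

# The same Smalltalk cascade, written on one line and spread over four.
one_line = "Transcript show: 'a'; show: 'b'; cr."
multi_line = "Transcript\n    show: 'a';\n    show: 'b';\n    cr."

# The same statement with descriptive and with single-letter names.
long_names = "total := price * quantity."
short_names = "t := p * q."

for label, text in [("one line", one_line), ("multi line", multi_line),
                    ("long names", long_names), ("short names", short_names)]:
    print(label, size_measures(text))
```

Under these assumptions, reformatting the cascade changes LOC but leaves WOC
alone, and shortening the variable names shrinks COC but leaves WOC alone.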

So, if you assume that function points are a reasonable measure of
application functionality (which may not be a rock-solid assumption),
then: 

Code Reuse = # of Function Points / Words of Code

How's that for quantifying code reuse? :-)  I'm interested in other
opinions on this... 
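
As a quick sketch of how the ratio would be used, assuming you already have a
function point count and a word count for each application.  The function
name reuse_score and the two sets of numbers below are invented purely for
illustration; they are not measurements of any real system.

```python
def reuse_score(function_points: float, words_of_code: int) -> float:
    """Code Reuse = # of Function Points / Words of Code."""
    if words_of_code <= 0:
        raise ValueError("words_of_code must be positive")
    return function_points / words_of_code

# Two hypothetical apps with the same functionality but different code size.
app_a = reuse_score(function_points=120, words_of_code=40_000)
app_b = reuse_score(function_points=120, words_of_code=60_000)

# The app that delivers the same function points in fewer words scores higher.
print(f"app A: {app_a:.4f}  app B: {app_b:.4f}")
```

The absolute value means little on its own; the ratio is only useful for
comparing systems whose function points were counted the same way.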

> The lines of code thing is just plain confusing.  Generally less is
> better, and I just don't think you can compare systems this way. 

I assume you're talking about the "defects per 1000 lines of code" that
the IEEE article referred to... I agree that that is a mediocre way to
measure code quality.  Simply shooting for fewer lines/words of code is
definitely more beneficial than trying to reduce the defect rate.  (The
code is more maintainable, testing/debugging is easier, etc.)

- Doug Way
  EAI/Transom Technologies, Ann Arbor, MI
  http://www.transom.com
  dway at transom.com, dway at mat.net

  Smalltalk: Guaranteed Y10K Compliant