Smalltalk = strongly typed (again)

Blake blake at kingdomrpg.com
Fri Oct 15 12:06:59 UTC 2004


On Thu, 14 Oct 2004 16:24:26 -0700, Rick McGeer <rick at mcgeer.com> wrote:

> This gets to the heart of the strong- vs weak-typing issue, and so let's  
> address it.  The notion that there is a clean separation between  
> design-, compile-, and execution time is a myth; and an expensive and  
> destructive one at that.

Can you imagine someone saying that about a building? "It's never really  
going to be finished, ma'am. It's in alpha currently, till we put those  
last doors on. You can move in during the beta period, but the doors don't  
work and you may fall through the stairs. We're gonna leave our tools  
lying around, too."

There is a clean separation =in many cases=. When I send someone a  
program, my purpose is to send them a product that will do what they want  
AND ABSOLUTELY NOTHING ELSE. Large hard-drives and fast CPUs  
notwithstanding, time and space matter. Security matters, too.

> Software isn't a concrete artifact like a boat or a car or an airplane.   
> It's an instantiation of a dynamic interaction with an environment.

A program can work according to specs (and then some). It can be easy to  
grow, modify and adapt, but that doesn't mean that all its incarnations  
must =always= be mutable, say in the hands of those who don't want to or  
shouldn't change it.

> This means paper designs aren't worth the disk space they're written on  
> -- because nobody understands a dynamic interaction until they see it in  
> action.

I have not personally done enough in this area to comment on it. I saw  
some impressive design documents in the early '90s which resulted in large  
programs being run flawlessly on first compile. I believe it's possible; I  
don't know if it's practical or worthwhile.

> In other words, coding is design and design is coding.  This also means  
> that coding never stops, because the environment the code executes in is  
> constantly changing.

In theory, the environment constantly changes; in practice, the changes  
don't matter for a large set of problems. (And most IT directors I know
work hard to keep the environment as static as possible.)

Let's take a simple example: A program to count the number of words in a  
text file. File goes in, words are parsed, output is given. I'm sure I  
have a 20-year-old BINARY around here that can do that. Where's the  
constantly changing environment?

OK, maybe you wrote the program to parse CR/LF and you need to adapt it to  
read just CR or just LF. Maybe you want to change what defines a word
(hyphenates, two words or one?). Just because you CAN fiddle with it  
forever doesn't mean it's never done. I'd say it's done, and redone, and  
maybe redone again. Rarely does the original program written to the  
original spec stop working; and I find it's often the case that people  
cling to old programs long after newer ones are released, not just out of  
comfort, but because what they had is GOOD ENOUGH. (I've worked with  
mainframe programs that have been unchanged since before I was born!)
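
For what it's worth, the core of such a word counter fits in a few lines of Object Pascal. This is only a minimal sketch, not the old binary mentioned above: CountWords is a hypothetical name, it works on a string rather than a file to stay self-contained, and it assumes a word is any run of non-whitespace characters. Under that assumption CR, LF and CR/LF all act as separators, and a hyphenate counts as one word.

```pascal
program WordCountSketch;
{$IFDEF FPC}{$mode objfpc}{$H+}{$ENDIF}

// Hypothetical helper: counts maximal runs of non-whitespace characters.
// Space, tab, LF (#10) and CR (#13) are all treated as separators.
function CountWords(const S: string): Integer;
var
  I: Integer;
  InWord: Boolean;
begin
  Result := 0;
  InWord := False;
  for I := 1 to Length(S) do
    if S[I] in [' ', #9, #10, #13] then
      InWord := False
    else if not InWord then
    begin
      // First character of a new word.
      InWord := True;
      Inc(Result);
    end;
end;

begin
  // Mixed CR/LF, CR-only and LF-only line endings.
  WriteLn(CountWords('one two'#13#10'three'#13'four'#10'five'));  // 5
  // Under the assumption above, a hyphenate is a single word.
  WriteLn(CountWords('well-known fact'));                         // 2
end.
```

Changing what counts as a word means changing the separator set, which is the kind of "redone" version described above rather than a sign that the original stopped working.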

You can always say, "OK, now make it play the Star-Spangled-Banner." But  
that's the Microsoft school of development and has less to do with  
delivering products that people want than keeping revenue streams flowing.

> A talented programmer understands this, writes for a  bald environment  
> (in other words, makes very few assumptions about what the environment  
> looks like),

A talented artist, one of my favorites, in fact, said "Art is never done."  
But it must be confessed that he had a problem with finishing things. Had  
he gone over and asked Michelangelo, who finished lots of things, he  
probably wouldn't have found much agreement (but no less talent).

Anyway, your statement does not match up with my experience. I find people  
lots more willing to pay for concrete, non-dynamic software that works on  
their machines, and more reluctant to pay for making it run on machines  
they don't have, and do things they don't want, or possess flexibility  
they don't perceive the need for. (Even when they DO need it.) I think  
"outside the box" in the name of self-defense--but I think =inside= the  
box for the same reason.

> documents his code extensively (because that's where dynamic assumptions  
> are captured),

That's a bad place for dynamic assumptions. There shouldn't be dynamic  
assumptions. Dynamic assumptions, buried in the code, no matter how well  
documented, are hard to root out.

Again, theory and practice: In practice, I've found that my best bets  
there are to get as much information as possible on reasonable limits, and  
then document the bejeezus out of those limits OUTSIDE the code. In one  
actual incident, with code that was around for about 10 years, I had an OS
limitation, and documented it in every place that it might occur: "A unit
cannot exceed X number of people. Right now the actual maximum is 2/3rds X  
but this may prove to be untrue with other data sources. The program will  
terminate when that happens."

Many years later, when things could change and the code had to be "fixed"  
to adapt, I didn't need documentation IN the code. I had a list of what  
each module did, and tracking down the limitation (which, of course, was  
only in one place) was easy. (This was over 50,000 lines of mostly  
undocumented code.)

> and writes his code in the debugger -- because he's testing as he  
> writes.  That's the principal insight of the late-bound programmer.

I'm a sucker for a good IDE, admittedly. At the same time, not all  
environments have debuggers. And it's really not that difficult to write  
correct code IF you know what you're doing. (And of course, you don't  
always.)

> Strong typing is one of many artifacts of the early-bound programming  
> universe that dominated the sixties, seventies and (through a number of  
> genuinely unfortunate accidents)  persists today despite its  
> obsolescence.  It's an attempt to capture dynamic information in a  
> static context.  Of course it never really works -- the information just  
> isn't present.

I'd disagree with both of those assessments. I see static typing as
important communication between developer and compiler, and between present
developer and future developer. I'd also disagree that it never really
works. It actually works very, very well. It's as easy to write traditional
Pascal code that actually works as it is hard to write unsubtle C or C++ code.

Objects complicate the matter, however, and objects give us a lot of  
things that we consider more valuable than static typing. (We'd all agree  
that we don't want to throw the baby out with the bathwater. We just can't  
agree on which is which. <s>)

> To take an example, one can I suppose declare that a variable is always  
> a Stack -- but one cannot capture whether the Stack is empty or full, or  
> (without hideous contortions) even what type of object is in the Stack.   
> Better to throw out the notion that one can statically check for any of  
> this, and instead use dynamic checks.

I don't follow this at all. There are a lot of ways to handle these issues  
in statically typed languages. In Object Pascal:

function TSomeClass.DoSomethingWithAStack(AStack: TStack): boolean;
begin
    if AStack.IsEmpty then // do something with the stack
    ...

Let's say that's not good enough, since we may have a stack that doesn't  
descend from TStack. Then you'd use an interface. (The code would be  
identical, except instead of TStack, by convention you'd have IStack.)
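
A minimal sketch of the interface version, assuming Free Pascal or Delphi. Only IsEmpty comes from the snippet above; the GUID, the TArrayStack class and its FCount field are hypothetical, and the routine is written as a plain function rather than a method of TSomeClass to keep the example self-contained.

```pascal
program StackInterfaceSketch;
{$IFDEF FPC}{$mode objfpc}{$H+}{$ENDIF}

type
  // Hypothetical interface; only IsEmpty is taken from the example above.
  IStack = interface
    ['{8F1B6C1E-3D5A-4B7E-9C2F-6A1D0E4B7C93}']
    function IsEmpty: Boolean;
  end;

  // Hypothetical stack that does NOT descend from TStack.
  TArrayStack = class(TInterfacedObject, IStack)
  private
    FCount: Integer;  // zero-initialized, so a new stack is empty
  public
    function IsEmpty: Boolean;
  end;

function TArrayStack.IsEmpty: Boolean;
begin
  Result := FCount = 0;
end;

// Accepts anything that implements IStack, whatever its ancestry.
function DoSomethingWithAStack(const AStack: IStack): Boolean;
begin
  Result := AStack.IsEmpty;
end;

var
  S: IStack;
begin
  S := TArrayStack.Create;
  WriteLn(DoSomethingWithAStack(S));
end.
```

Passing an object whose class does not declare IStack is rejected at compile time, which is the static check being discussed.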

Unless I'm missing something, this isn't really substantially different  
from Smalltalk EXCEPT:

-> You MUST define the interface up front

-> You CAN'T pass AStack to DoSomethingWithAStack unless it's the right  
class or implements the right interface

-> You CAN make it as loosey-goosey as you want by not implementing all
methods of IStack: You lose the static type-checking as a result, which
leaves you with dynamic type-checking.

As for what type of object is in the stack, I thought the point of  
polymorphism was that you didn't care. And it's not at all true that you  
can't find it out, anyway. Again, in OP, you could query the object for  
its class (or any of its ancestors) OR any specific interface.
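
To illustrate those queries, here is a short sketch assuming Free Pascal or Delphi. The types IPrintable, TAnimal and TDog are hypothetical and exist only for the example; ClassName, InheritsFrom, the "is" operator and SysUtils.Supports are the standard facilities being shown.

```pascal
program TypeQuerySketch;
{$IFDEF FPC}{$mode objfpc}{$H+}{$ENDIF}

uses
  SysUtils;  // for Supports

type
  // Hypothetical interface and classes, for illustration only.
  IPrintable = interface
    ['{2C7A9E50-1B3D-4F68-A5C4-7E9D0B2F6A18}']
    function Describe: string;
  end;

  TAnimal = class(TInterfacedObject)
  end;

  TDog = class(TAnimal, IPrintable)
  public
    function Describe: string;
  end;

function TDog.Describe: string;
begin
  Result := 'a dog';
end;

var
  Item: TObject;   // stands in for "whatever came off the stack"
  P: IPrintable;
begin
  Item := TDog.Create;

  WriteLn(Item.ClassName);              // the object's own class name
  WriteLn(Item is TAnimal);             // class or any ancestor
  WriteLn(Item.InheritsFrom(TAnimal));  // same question, asked as a method

  // Query for a specific interface; P keeps the object alive from here on.
  if Supports(Item, IPrintable, P) then
    WriteLn(P.Describe);
end.
```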

Thanks for the food for thought!


