Type Safety (was Re: fun and empowerment)

Richard A. O'Keefe ok at hermes.otago.ac.nz
Tue Feb 1 23:18:42 UTC 2000


	On Sat, Jan 29, 2000 at 02:21:04PM -0500, "David N. Smith IBM" wrote:
	restrictions.  For example, if a processor has 32-bit registers and
	the compiler implements a 16-bit short type, the compiler has to make
	sure that the results of short arithmetic are truncated down to 16
	bits after each arithmetic operation.

False, and false for two reasons.

First, the C standards (K&R 1, C89, C99) all specify that shorter-than-int
operands are widened to int; this means that on a machine that has 32-bit
registers and no 16-bit multiply, divide, or compare, any sane compiler
writer will rule that int=32 bits and do the arithmetic in the 32-bit
registers as 32-bit numbers.  Similarly, on a 64-bit machine, your 16-bit
numbers will be added, subtracted, multiplied, and so on as SIXTY-FOUR BIT
numbers.  Suppose for example that 'short' is 16 bits and 'int' is 32 bits,
and that the C system reports signed integer overflow by raising an exception
(which the standards very carefully allow, and which some C systems in fact
do), then
	short x = 1000, y = 1001, d = 10;
	short c = (x/d)*(y*10) > x*y;
*must* be computed correctly; not only is it not the case that the compiler
must truncate, but it must NOT truncate!
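
To make that concrete, here is a compilable version of the example above
(a sketch only, assuming the usual case of a 16-bit 'short' and a 32-bit
'int'; the printf calls are mine, added just to show the values):

	#include <stdio.h>

	int main(void)
	{
	    short x = 1000, y = 1001, d = 10;
	    /* Each operand is promoted to int before the operator is
	       applied, so x*y (1001000) is computed at 32-bit width and
	       never has to fit in a 16-bit short. */
	    printf("%d\n", (x/d)*(y*10));        /* prints 1001000 */
	    printf("%d\n", x*y);                 /* prints 1001000 */
	    printf("%d\n", (x/d)*(y*10) > x*y);  /* prints 0 */
	    return 0;
	}

Only the final assignment to c narrows anything back to short; the
intermediate arithmetic is int arithmetic.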

Second, suppose a C compiler writer were sufficiently loopy to impose
int=16 bits on a 32-bit machine.  The "as if" rule applies.  The ISO standards
make it clear that a C system doesn't have to do literally what you see, only
something that gives the same answers.  So on a machine where there are
addition and subtraction instructions that don't trap on overflow, any
amount of arithmetic involving +, -, ~, &, |, and << can be done without
adding truncation instructions.  They would only be needed before
*, /, %, >>, and comparisons.
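
A sketch of why (unsigned types used so the behaviour is well defined,
and assuming a 16-bit 'unsigned short' and a 32-bit 'unsigned int'; the
variable names are made up):

	#include <stdio.h>

	int main(void)
	{
	    unsigned int a = 0xFFFFu, b = 3u;
	    /* Addition: truncating only at the very end gives the same
	       low 16 bits as truncating after every step... */
	    unsigned short add_late  = (unsigned short)(a + b);
	    unsigned short add_early = (unsigned short)
	        ((unsigned short)a + (unsigned short)b);
	    /* ...but division looks at the high bits of its operands,
	       so the intermediate result must be truncated first. */
	    unsigned short div_late  = (unsigned short)((a + b) / 2);
	    unsigned short div_early = (unsigned short)
	        ((unsigned short)(a + b) / 2);
	    printf("%04X %04X  %04X %04X\n",
	        (unsigned)add_late, (unsigned)add_early,
	        (unsigned)div_late, (unsigned)div_early);
	    /* prints: 0002 0002  8001 0001 */
	    return 0;
	}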

	If C didn't have types--just
	had register-sized variables that you could treat however you
	wanted--the code generated would be more efficient, at least on a
	mediocre compiler.
	
C *does* have register-sized variables.  They're called 'unsigned int'.

	Types are a big win in C because they help the compiler catch stupid
	but fatal errors (e.g. "foo = NULL;" vs. "*foo=NULL;").
	
Er, C compilers are notorious for NOT catching that particular problem.
The usual definition of NULL is a plain 0.  (0 cast to void* was supposed to
be legal, but there were dodgy details, such as (char*)0 being a "null pointer
constant" while (char*)(void*)0 was NOT one, so it was much simpler all round
to use the plain definition.)  So
	char a; char *p = NULL; p = &a; *p = NULL;
is correctly allowed by most C compilers.
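
Spelled out as a whole program (a sketch, assuming the common case where
NULL expands to plain 0; with a ((void *)0) definition the last assignment
would at least earn a diagnostic):

	#include <stddef.h>

	int main(void)
	{
	    char a = 'x';
	    char *foo = &a;
	    foo = NULL;    /* clears the pointer: presumably what was meant */
	    foo = &a;
	    *foo = NULL;   /* stores 0 in the char: the "stupid but fatal"
	                      slip, yet no diagnostic is required */
	    return 0;
	}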

	I don't think strong typing should be evaluated in a vacuum.
	
Neither do I, but on any reasonable interpretation of the term
"strong typing", C hasn't got it and never has had.  To see the merits
of strong typing, look at Ada, or Eiffel, or Haskell, or Clean, or
Mercury.

	For a good example of a safe but truly typeless language, try Tcl.  In
	Tcl, everything is a string.  This is flexible but the language
	doesn't "know" the type of anything and so rarely is able to do any of
	those automatic type-specific things most other languages do
	(e.g. automatically formatting output in a "correct" manner).
	
This hasn't been true since the release of Tcl 8.0, which was some time
ago now.  These days, the Tcl system knows perfectly well whether something
is a number or a list or whatever; things happen to be strings at the same
time.  This wasn't done to catch errors, but for the sake of the bytecode
compiler that also arrived in Tcl 8.0.
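
For the curious, the dual representation shows up directly in the C API
(a minimal sketch using the standard Tcl_Obj calls; compile and link it
against your Tcl library):

	#include <tcl.h>
	#include <stdio.h>

	int main(void)
	{
	    /* A Tcl_Obj carries an internal representation (here an int)
	       and produces its string representation on demand. */
	    Tcl_Obj *obj = Tcl_NewIntObj(42);
	    int value;

	    Tcl_IncrRefCount(obj);
	    Tcl_GetIntFromObj(NULL, obj, &value);
	    printf("as int:    %d\n", value);
	    printf("as string: %s\n", Tcl_GetString(obj));
	    Tcl_DecrRefCount(obj);
	    return 0;
	}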

	The proof of this can be seen in how strongly-typed languages try to
	implement this sort of behaviour.  (Templates, anyone?)
	
For "strongly-typed languages" read "SOME strongly-typed languages".
C++ templates are fairly extreme; Eiffel and Ada do it better.
Then there's Haskell, Clean, and Mercury, which use "type classes".




