This was a great article -- thanks for posting it, Marcus! The authors did a VERY careful job with the experimental design and the evaluation. This is the first serious empirical effort I've seen on measuring the value of types.
Unfortunately, a well-controlled experiment usually means a pretty narrow finding, and that seems to be the case here. The main finding is that programmers who used a more strongly typed C (ANSI C) made fewer type errors when connecting with a library that had typed interfaces (Motif) than programmers using a less strongly typed C (K&R C). The point is well made and well supported, but it doesn't settle all of the questions about typing. For example, it's not true that they made fewer overall errors -- there was no significant difference between the groups on severe errors that were not related to type.
In particular, this experiment doesn't tell us anything about the value of type systems to catch errors in problem and program understanding. Consider, for example, designing an appointment book where there are user-defined types Appointment, Date, Contact, AppointmentBook, and so on. A good type system should be able to catch when you're passing in a mere Date to something that expects a whole Appointment (presumably associated with a Date and perhaps a Contact), though for some uses, Appointments and Dates may support identical interfaces, so it should be acceptable to use them interchangeably if only the common protocol is being used. Does having a strongly-typed language with this kind of flexibility help you (i.e., catch errors that you might not catch otherwise) or hurt you (i.e., cost you more in dealing with type declarations and coercions than you save in debugging time)?
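To make the appointment-book question concrete, here is a minimal sketch in Python using type hints and a structural Protocol. Everything in it is an assumption made up for illustration: the HasDay protocol, the day() method, the field names, and the label and contact_of functions are not from the paper or from any real library. Python does not enforce these annotations at run time, so the "catch" would come from a separate static checker such as mypy.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


class HasDay(Protocol):
    """Hypothetical common protocol: anything that can report its day."""

    def day(self) -> str: ...


@dataclass(frozen=True)
class Date:
    year: int
    month: int
    dom: int  # day of month

    def day(self) -> str:
        return f"{self.year:04d}-{self.month:02d}-{self.dom:02d}"


@dataclass(frozen=True)
class Appointment:
    date: Date
    contact: Optional[str] = None

    def day(self) -> str:
        # An Appointment answers day() by delegating to its Date.
        return self.date.day()


def label(item: HasDay) -> str:
    # Uses only the common protocol, so a Date or an Appointment will do.
    return "on " + item.day()


def contact_of(appt: Appointment) -> str:
    # Needs a whole Appointment; a mere Date has no contact.
    return appt.contact or "(no contact)"


d = Date(2000, 1, 28)
a = Appointment(d, "Marcus")

print(label(d))       # Date used through the common protocol
print(label(a))       # Appointment used through the same protocol
print(contact_of(a))

# contact_of(d)  # a static checker should reject this call; without one,
#                # it fails only at run time with an AttributeError
```

The flexible case (label) costs one extra declaration, the protocol itself. The strict case (contact_of) is where a checker would earn its keep. Whether that trade pays off in practice is exactly the question the experiment leaves open.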
Coming up with an experiment or set of experiments to really explore the value of types is a challenge. There are lots of language issues to deal with, as well as issues with the programmers themselves. For example, in the Prechelt and Tichy paper, all the programmers in the study were quite experienced in C, which allowed them to tackle a larger-than-toy program, but also meant that they may have already developed idioms and practices which may have led them toward being more successful in ANSI C than K&R C. It's a hard question to answer.
Mark
On Thu, Jan 27, 2000 at 10:11:55PM -0500, Mark Guzdial wrote:
But this is actually more than me being "pedagogues who look on their profession as an opportunity for pederastic abuse" :-) I'm seriously interested: Does anyone know of any empirical evidence for the value of types? Or is it a myth that we invented to rationalize the typing needed to improve the compiler's performance?
Lutz Prechelt, Walter F. Tichy. A Controlled Experiment to Assess the Benefits of Procedure Argument Type Checking.
IEEE Trans. on Software Engineering, 24(4):302-312, April 1998.
http://wwwipd.ira.uka.de/~prechelt/Biblio/tcheck_tse98.ps.gz
-- Marcus Denker marcus@ira.uka.de phone@home:(0721)614235 @work:(0721)608-2749
-------------------------- Mark Guzdial : Georgia Tech : College of Computing : Atlanta, GA 30332-0280 (404) 894-5618 : Fax (404) 894-0673 : guzdial@cc.gatech.edu http://www.cc.gatech.edu/gvu/people/Faculty/Mark.Guzdial.html