It is said that C's type system is unsound, which means that it has "false negatives", it tells the programmer everything is fine, but then the code fails at runtime. for example, "the program can reach a state where an expression evaluates to a value that doesn't match the expression's static type"
I learned it from the lecture notes of the course CSE341 in university of Washington which are available online.
It says:
But we can give a more precise description of correctness by defining the terms soundness and completeness. For both, the definition is with respect to some thing X we wish to prevent. For example, X could be, "a program looks up a variable that is not in the environment."
A type system is sound if it never accepts a program that, when run with some input, does X.
A type system is complete if it never rejects a program that, no matter what input it is run with, will not do X.
So, basically a language is unsound if a program type-checks and claims there is no unwanted behavior, yet that unwanted behavior occurs at runtime.
For instance, in C we can write:
#include <stdio.h>
int main() {
// in grams
float medicineAmounts[] = {12, 8, 5.5};
float totalAmount = 0;
for(int i = 0; i < 5; i++) { // mistake: should be i < 3
totalAmount += medicineAmounts[i];
printf("%f\n", medicineAmounts[i]);
}
printf("total: %f grams", totalAmount);
// outputs: total 607785.50 grams :D
return 0;
}
My question is: while the language is clearly unsafe, is it really unsound? We expect the expression medicineAmounts[i] to always evaluate to a value of type float (which it does—the garbage values are still bit patterns interpreted as floats). But does the C standard actually promise that out-of-bounds access will work correctly? Since the C standard says it's an "undefined behavior", and undefined means the implementation can do anything, why call it unsound when C doesn't consider it a problem?
In other words: can a language be called "unsound" if it never promised the safety guarantees it's accused of violating?
Edit
this Wikipedia article is very nice
and full good insights and links to good resources
what precisely it means for a program to be "well typed" or to "go wrong" are properties of its static and dynamic semantics, which are specific to each programming language. Consequently, a precise, formal definition of type soundness depends upon the style of formal semantics used to specify a language
In 1994, Andrew Wright and Matthias Felleisen formulated what has become the standard definition and proof technique for type safety in languages defined by operational semantics,[4] which is closest to the notion of type safety as understood by most programmers [...]
Andrew Wright and Matthias Felleisen defined one proprty that seems to answer my question:
Progress: A well-typed program never gets "stuck": every expression is either already a value or can be reduced towards a value in some well-defined way. In other words, the program never gets into an undefined state where no further transitions are possible.
By that definition, I think, it makes the C example program I wrote above unsound.
Finally, perhaps it's worth noting, Matthias Felleisen once said in a presentation on Youtube, that many academics think of soundness as a yes/no matter, and he, as an academic and a programmer, believes it's not practical criteria.
he said, we should think of it as a spectrum, we went from really evil (C) to pretty ok (e.g. Java)