notes-computer-programming-programmingLanguageDesign-prosAndCons-nullableTypes

Florian Weimer 10/22/10 Re: [go-nuts] Re: "The Language I Wish Go Was"

> even in languages with no nil, such as Haskell, you can still get
> exceptions from using things that have an unexpected form.
>
> e.g. head []
>
> that's not too different from a nil pointer exception.

(not a Go programmer)

That's just like division by zero. The problem with null pointers is that they tend to propagate quite far from the place where they were created, making error handling difficult (and debugging, too). I suppose that once nullability ends up in the type system, this type of propagation is restricted. There is a cost of making the type system more expressive.
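
As an illustration of the propagation problem (not part of the original post), here is a minimal Go sketch in which a nil pointer is created in one place, stored without being inspected, and dereferenced somewhere else. The names Config, Holder, lookup and describe are hypothetical.

```go
package main

import "fmt"

// Config is a hypothetical type used only for this illustration.
type Config struct {
	Name string
}

// lookup is where the nil is created: indexing a map with a missing key
// yields the zero value of the element type, here a nil *Config.
func lookup(configs map[string]*Config, key string) *Config {
	return configs[key]
}

// Holder stores the pointer without inspecting it, so a nil travels on.
type Holder struct {
	cfg *Config
}

// describe dereferences the stored pointer. It recovers from the panic
// only so that the failure can be reported as a string.
func describe(h Holder) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint("panic far from the source: ", r)
		}
	}()
	return h.cfg.Name // panics here if cfg is nil
}

func main() {
	configs := map[string]*Config{"prod": &Config{Name: "production"}}
	h := Holder{cfg: lookup(configs, "staging")} // the nil is created here
	fmt.Println(describe(h))                     // the failure surfaces here
}
```

The traceback of the unrecovered panic would point at describe, not at lookup, which is the debugging difficulty the paragraph above refers to.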

Here are a few examples. Go currently accepts this code:

func print(value string) {
    fmt.Printf("%s\n", value)
}
var x string
print(x)

If you introduce nullable types, that would probably not be allowed anymore. This might not seem so bad, but you also lose the ability to write:

var x string
if condition {
    x = func1()
} else {
    x = func2()
}
print(x)

This might be fixed by allowing if statements in conditions. However, this:

var x T1
var y T2
if condition {
    x = func1a()
    y = func1b()
} else {
    x = func2a()
    y = func2b()
}
print(x)

is more readable than the ML-style

x, y := condition ? (func1a(), func1b()) : (func2a(), func2b())

which tears apart declarations and their values.
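
Go has no conditional expression, so the one-line form above is not valid Go today. One way to keep declaration and value together in current Go is to move the branch into a function with multiple return values. This is only a sketch; choose and its return values are hypothetical.

```go
package main

import "fmt"

// choose is a hypothetical helper. It returns both values at once, so the
// caller can declare and initialize x and y in a single statement.
func choose(condition bool) (string, int) {
	if condition {
		return "first", 1
	}
	return "second", 2
}

func main() {
	// Neither variable is ever observable in an unassigned state.
	x, y := choose(true)
	fmt.Println(x, y)
}
```

The cost is the one described in the text: the branch bodies now live away from the place where x and y are used.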

But if you want to allow the statement variant in the face of nullable types, you need to track definite assignment. Even in itself, this is not a lightweight feature. But it's fairly straightforward (except for object creation; more on that below). Once you've that, the following becomes illegal:

func set(ptr *string) {
    *ptr = "hello, world"
}
var x string
set(&x)
print(x)

An ordinary pointer makes no guarantees about write access. For this reason, many languages with definite assignment throw out pointers altogether. We could introduce definitely-assigned pointers. These could only be used in function arguments, and the compiler enforces that on every normal execution path, the function returns only after an assignment to the pointed-to value. And at least prior to that assignment, the pointed-to value could not be read. This feature would have additional interactions with function types and general type compatibility. I fear that making these new pointer types first-class (that is, usable outside the function on which they are a parameter) puts us well into the range of researchy-languages.
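
To make the point about write access concrete, here is a small sketch in current Go (setVia and produce are hypothetical names). The pointer parameter's type says nothing about whether the callee writes through it, whereas a return value is assigned at the point of declaration.

```go
package main

import "fmt"

// setVia writes through an ordinary pointer only on some paths. Nothing in
// its signature tells the caller or the compiler whether the write happens.
func setVia(ptr *string, doWrite bool) {
	if doWrite {
		*ptr = "hello, world"
	}
}

// produce returns the value instead, so the caller's variable is
// initialized in the same statement that declares it.
func produce() string {
	return "hello, world"
}

func main() {
	var x string // today this is the zero value ""
	setVia(&x, false)
	fmt.Printf("%q\n", x) // x was never written by setVia

	y := produce()
	fmt.Printf("%q\n", y)
}
```

A definite-assignment checker would have to reject the setVia call pattern unless it could see into the callee, which is what the proposed definitely-assigned pointers would encode in the type.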

There's another oddity: non-atomic object construction. If you construct an object incrementally, without additional, rather complex machinery, you'll see null values in nullable types. (Java has a related problem: nominally constant object fields can change their values during the execution of a constructor, and this is observable by programs. It is sometimes abused to create cyclic immutable-after-creation data structures.)
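
A minimal Go sketch of the construction problem, assuming a hypothetical two-node ring: the cycle cannot be built in one step, so the first node's Next field is necessarily nil for a moment.

```go
package main

import "fmt"

// Node is a hypothetical element of a two-node ring.
type Node struct {
	Name string
	Next *Node
}

// newRing builds the cycle a -> b -> a and reports whether a.Next was nil
// at the intermediate step.
func newRing() (a *Node, sawNil bool) {
	a = &Node{Name: "a"} // b does not exist yet, so a.Next must be nil
	sawNil = a.Next == nil
	b := &Node{Name: "b", Next: a}
	a.Next = b // only now is the cycle closed
	return a, sawNil
}

func main() {
	a, sawNil := newRing()
	fmt.Println(sawNil, a.Next.Name, a.Next.Next.Name)
}
```

If Next had a non-nullable type, the first line of newRing would have nothing valid to store in it.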

I think the argument above is not a mere eristic device (even though it's structurally similar to a fake reductio ad absurdum). I'm not entirely convinced that the bad reputation of null values comes from their use as an error indicator (especially in languages which lack multiple return values). It will be interesting to see how many bug reports for Go applications will contain hard-to-understand tracebacks caused by an attempt to dereference a nil value. In this context, it is a bit odd that a[x] returns a zero value and does not panic (security concerns aside, the panic could contain a hint regarding the key, making the error more understandable).
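
For a map, the behaviour of a[x] mentioned above looks like this in Go. The sketch uses the two-value ("comma ok") index form to turn a missing key into an error that names the key; User and find are hypothetical.

```go
package main

import "fmt"

// User is a hypothetical element type.
type User struct {
	Name string
}

// find uses the two-value index form so that a missing key is reported
// with the key in the message instead of silently yielding nil.
func find(users map[string]*User, key string) (*User, error) {
	u, ok := users[key]
	if !ok {
		return nil, fmt.Errorf("no user for key %q", key)
	}
	return u, nil
}

func main() {
	users := map[string]*User{"ann": &User{Name: "Ann"}}

	// The single-value form returns the zero value, a nil *User here.
	fmt.Println(users["bob"] == nil)

	if _, err := find(users, "bob"); err != nil {
		fmt.Println(err)
	}
}
```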

toread:

https://groups.google.com/forum/#!topic/golang-nuts/rvGTZSFU8sY[1-25-false]

Russ Cox 10/22/10 Re: [go-nuts] Re: "The Language I Wish Go Was"

> The impression I got from reading (now and back then) the "Repeating
> the billion dollar mistake?" discussion is that, in the minds of the
> original Go creators, non-nullable pointers have little value. The
> generic reason for this is that Go is *not* a language built for the
> purpose of fulfilling the needs of those programmers who want to
> translate their ideas into code with as much precision as possible or
> want the compiler to "understand" what the programmer means.

Wow. That's pretty harsh.

I don't think it's fair either. If there were a way for the type system to help avoid nil pointer dereferences without completely changing the style of the language, then I at least would be interested. But as I said yesterday, that thread did not have any proposals that both (a) worked and (b) left the language intact. It's certainly true that if we moved to an ML-style model (ignoring the polymorphism for now) for defining types and constructing data, that might work (but see roger's reply). But it would be such an invasive change that I don't think the result would be recognizable as Go.

Russ

> I don't think it's fair either. If there were a way for the type
> system to help avoid nil pointer dereferences without completely
> changing the style of the language, then I at least would be interested.

Based on previous discussions with you and others here, and based on what I read on golang-nuts so far, I have doubts whether any proposal to add something to the language will be accepted. Nevertheless, let me try (I am not hoping for anything to happen):

I propose to add "#T" to the language. It means "non-null pointer to an object of type T". The language rules for handling values of type "#T" are basically the same as the rules for handling "*T", except that "#T" does not accept nil. A value of "#T" can be assigned to a variable of type "*T" at any time (see also "postponed initialization" described below).

Without adding anything else, this already makes "#T" usable in modelling a lot of situations.

A value of type "*T" can be converted to type "#T" by an if expression:

var x *T = fn()
if x != nil {
    // x has type "#T"
}
// x has type "*T"
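
Current Go has no "#T", but the nil-check conversion can be roughly approximated with a wrapper type. The following is only a sketch under that assumption; NonNilT, Check and Get are hypothetical names, and the compiler does none of the flow analysis the proposal describes.

```go
package main

import "fmt"

type T struct{ i int }

// NonNilT is a hypothetical stand-in for "#T". The wrapped pointer is
// unexported, so other packages can only obtain a value through Check.
type NonNilT struct{ p *T }

// Check plays the role of the "if x != nil" conversion: it yields a
// NonNilT only when the pointer is not nil.
func Check(x *T) (NonNilT, bool) {
	if x == nil {
		return NonNilT{}, false
	}
	return NonNilT{p: x}, true
}

// Get returns the wrapped pointer.
func (n NonNilT) Get() *T { return n.p }

func main() {
	if n, ok := Check(&T{i: 7}); ok {
		fmt.Println(n.Get().i)
	}
	if _, ok := Check(nil); !ok {
		fmt.Println("nil rejected")
	}
}
```

The sketch also shows why the proposal needs a rule about default values: the zero value NonNilT{} still wraps a nil pointer, and Go cannot forbid declaring it.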

(In the following text, "invalid" means a compile-time error.)

Contrary to languages such as C++, there is no automatic conversion from "T" to "#T":

var x T
var y #T = x     // Invalid
var y #T = &x    // OK

The expression "y := &x" still declares a variable of type "*T". No legacy Go code is broken by this proposal.

A value of type "#T" has no default initial value:

var t #T                        // Invalid
type S struct { x #T; y *T }
var s S                         // Invalid
s1 := S{ y: nil }               // Invalid
s2 := S{ x : &T{} }             // OK, assuming T{} is valid

It is prohibited to use type "#T" in data structures where the value of type "#T" is part of a cycle formed during initialization of the data structure. In such cases, the programmer has to use "*T".

The initialization of a value of type "#T" can be postponed by using the following pattern:

func F(#T) { ... }
func G(*T) { ... }
type T struct { i int }
var t #T
var p *T
t.i = 1             // Invalid
p = t               // OK, equivalent to "p = nil"
if condition {
    F(t)            // Invalid
    G(t)            // OK, equivalent to G(nil)
    t.i = 1         // Invalid
    t = &T{}
    t.i = 17        // OK
    F(t)            // OK
    G(t)            // OK
} else {
    F(t)            // Invalid
    G(t)            // OK, equivalent to G(nil)
    t.i = 1         // Invalid
    t = &T{}
    t.i = 23        // OK
    F(t)            // OK
    G(t)            // OK
}
t.i = 29            // OK
F(t)                // OK
G(t)                // OK

In other words, the compiler keeps track of whether a conditional block of code has initialized "t" or not. In addition to this, an

func Initializer(t *#T) {
    if condition {
        *t = &T{31}
    } else {
        *t = &T{37}
    }
}
var t #T
println(t.i)        // Invalid
Initializer(&t)
println(t.i)        // OK

Considering that coding patterns such as this one are *not* common, I recommend *not* to implement this extension. (It can be implemented in future if the need arises.)

... have a nice weekend thinking about this ...

> I propose to add "#T" to the language.

This proposal came up before. The main sticking points are the ones you identified.

In general the Go team has been very conservative about language changes, only making them when everything is exactly right. Can those issues be made exactly right? I don't know.

Ian

" An absolute horror is having a 3rd-party library which defines "struct S { Field #T ... }" and you want to use "S" in an initialization cycle or in code the compiler is unable to understand - this case is really painful. It can be "solved", by forbidding any public fields of a structure to have type #T. "

"

A type-switch may seem problematic at first, but (maybe) it is not that hard:

var t interface{}
switch t.(type) {
case #T:
}

The meaning which I would propose is that "case #T" is (informally speaking) equivalent to "(case *T) and (t != nil)".
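
One detail of current Go matters for this reading: an interface value that holds a nil *T is itself not equal to nil, so the nil test has to be applied to the pointer obtained from the type switch, not to the interface. A minimal sketch (classify is a hypothetical name):

```go
package main

import "fmt"

type T struct{ i int }

// classify distinguishes a nil interface, an interface holding a nil *T,
// and an interface holding a non-nil *T.
func classify(v interface{}) string {
	switch p := v.(type) {
	case nil:
		return "nil interface"
	case *T:
		// This branch is where a hypothetical "case #T" would have
		// to add its check: on p, the pointer, not on v.
		if p == nil {
			return "*T but nil"
		}
		return "non-nil *T"
	default:
		return "other"
	}
}

func main() {
	var typedNil interface{} = (*T)(nil)
	fmt.Println(typedNil != nil) // the interface itself is not nil
	fmt.Println(classify(nil))
	fmt.Println(classify(typedNil))
	fmt.Println(classify(&T{i: 1}))
}
```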

> In general the Go team has been very conservative about language
> changes, only making them when everything is exactly right. Can
> those issues be made exactly right? I don't know.
>
> Ian

Whether the #T-related issues can be made exactly right: I think they cannot. It seems that in the case of #T (i.e: non-nullable type) there always exists a line the compiler won't be able to cross. The number of cases a compiler would fail to understand is infinite. "