The type of updateable values in Ten15

(Examples in a ficticious, hopefully obvious strongly-typed, higher-order language)

Languages with polymorphic type systems, most notably Standard ML, have struggled to define the type of updateable values (which I will henceforth call variables). For SML, this was partly caused by interactions with the type inference mechanism, but even where type inference was not attempted, the types of variables have not been really satisfactory when subtype relationships come into the picture.

For example, let us say that the types of variables are created by a type constructor - that is, a (compile-time) function from types to types - conventionally called 'ref'. Thus a variable which will contain an integer would be declared by:
v: ref int;
just like:
l: list int;
declares l to be a list of integers.

The operations defined on refs are as follows:
:= : ∀ T . (ref T × T) -> void
!   : ∀ T . ref T -> T
The first defining assignment, so, using the definition of v above, we can say:
v := 10;
(using a suitable infix notation), and the second defining dereferencing, so we can also say:
x: int;
x = !v;
(using a prefix notation).

This is fine, until we start to look at subtype relationships.

Problems with the traditional ref constructor and Subtypes

Subtype relationships are of the form S ≤ T, where S and T are types, and state that S 'has at least all the features of' T, and thus that values with the type S can be used in any situation that requires a value with the type T. This is the type foundation of inheritance-like compatibility rules in object-oriented languages, and is usually defined on a type-by-type basis, thus:
identity
S ≤ S
is always true
lists
list S' ≤ list S
iff  S' ≤ S
pairs
(S', T') ≤ (S, T)
iff  S' ≤ S & T' ≤ T
records
{a: S', b: T', c: U, etc} ≤ {a:S, b:T}
iff  S' ≤ S & T' ≤ T
functions
(S' -> T') ≤ (S -> T)
iff  S ≤ S' & T' ≤ T
etc


The interesting cases being lists and pairs, which have the sort of useful subtype relationship that we desire for variables, records that allow extra fields in subtypes, and functions that display contravarience - the requiement on the parameter type is the reverse of that initially expected. The 'usual' relationship displayed elsewhere is called covarience.

The problem with the ref type constructor is that the best rule that is usually come up with is:
ref S' ≤ ref S        iff    S' = S
which is not particularly useful (except in that it is at least safe!).

Motivation for variables in Ten15

If we look again at the operations which can be applied to a variable, we see that we can also think of a variable as a pair of accessor routines, so that:
ref T ≡ (T -> void, void -> T)
and assignment and dereferencing could be implemented by:
:= (v, x) = (first v)(x)
! (v)       = (second x)( )
(if the two accessor functions shared access to a variable in a common scope, then this could actually work! But not as efficiently as a conventional implementation.)

Variables in Ten15

Looking at the definition in the previous section, we see that the type of the value stored in the variable occurs twice in the definition. Ten15 avoids many of the problems of conventional variables by unlinking these two occurances, thus:
ref[S => T] ≡ (S -> void, void -> T)
(with the consistency constraint that S ≤ T)

Here S is the type of values that can be assigned to the variable, and T is the type of values that can be got back back dereferencing the variable. ref[T] becomes simply a short hand for ref[T=>T].

Immediately we see (because S occurs contravariently in the assignment accessor, and T occurs covariently in the dereference accessor) that this version of ref satisfies the subtype relationship:
ref[S' => T'] ≤ ref[S => T]        iff    S ≤ S' & T' ≤ T
just like the rule for functions.

In Ten15 all updateable datastructures - variables, arrays, etc - have two types, in this way.

Uses of extended variable types

Aside:
We need to introduce two more, somewhat special types:
^ is the top type
Top is the type containing all values (i.e. the type whose value set is the universal set). Top is the top element in the type lattice, and every other type is a subtype of top, i.e. ∀ T . T ≤ ^. Thus, wherever a value of type top is required, any other value can be given. However there are no operations on values of type top, so there isn't very much useful that can be done with it! Top can also be thought of as the union of all other types:
^ =  ∀ T . T
Top is usually represented by a symbol which is an upside-down bottom (⊥), but this is not in HTML's repertoire, so I'm using '^' here instead!
⊥ is the bottom type
Bottom is the type that has no values (i.e. the type whose value set is the empty set). Since there are no values with the type bottom it is not possible to call a function that has a parameter type of bottom. Bottom is often used to represent the value returned by the non-terminating computation, but it has a possibly more respectable definition as:
⊥  =  ∃ T . T
Bottom is the bottom element in the type lattice, and a subtype of every other type, i.e. ∀ T . ⊥ ≤ T
There are some useful coercions for these extended refs, which aren't possible with conventional variables:
make_readonly:    ref[S => T] -> ref[⊥ => T]
make_writeonly:  ref[S => T] -> ref[S => ^ ]
ref[⊥ => T] is the type of a read-only variable, since the only values that can be assigned to it are values of the type bottom, and there are no such values, and ref[S => ^ ] is the type of a write-only variable, since dereferencing the variable returns a value of type top, which cannot be manipulated.

These coercions change the type of the binding to the variable, they do not change the variable itself. Thus it is possible to create a variable, and give a write-only binding to the variable to one part of a program, and a read only reference to another. This enforces one-way communication through the variable.

Subtype relationships for variables

Note that:
ref[S => T]    ≤    ref[⊥ => T]            ∀ S, since ⊥ ≤ S
and:
ref[S => T]    ≤    ref[S => ^ ]            ∀ T, since T ≤ ^
So that wherever a read-only or write-only reference is needed, it is always possible to supply a read-write reference.

Also we find that some references have something close to the subtype relationship that we desired in the introductory sections:
ref[S => T']    ≤    ref[S => T]            iff            T' ≤ T
and the special case for read-only references is:
ref[⊥ => T']    ≤    ref[⊥ => T]            iff            T' ≤ T
My thesis showed how this can be used to provide a self reference when implementing objects.

Notes

Note that the subtype relationship preserves the 'consistency constraint' mentioned above, in the sense that if we have a variable type:
ref[S => T]
that satisfies the 'consistency constraint', i.e. S ≤ T, and:
ref[S => T] ≤ ref[S' => T']
then:
S' ≤ T'            (since S' ≤ S ≤ T ≤ T')
i.e., the consistency constraint on the ref with the supertype is automatically satisfied. This means that substituting a valid variable into a context where a more general variable was expected will still be a valid variable. A very useful property! Since otherwise we would have to do a run-time check to check the consistency constraint whenever a compatible subtype was used. Since all variables are created with well-known types that can be checked, this means that all other variable bindings that can actually be used will automatically be consistent - those that are not consistent will never find a real variable that is type compatible with them!

However, the opposite is not true - if we start with a variable satisfying the consistency constraint, and then construct subtypes by finding supertypes for the assignment type, and subtypes for the dereference type, then, at some stage, the variable will not be consistent. Fortunately this is not something which in practice we want to do very much.

Copyright 2002, Martin Atkins
This page can be linked to, and unedited copies can be freely reproduced.