The 'Unique' type constructor in Ten15

(Examples in a ficticious, hopefully obvious strongly-typed, higher-order language)

'Unique' is a type constructor, like 'list'. That is, a (compile-time) function from types to types. Thus:

a: unique int;
b: unique (int -> int);

Declares a and b to be unique ints, and unique 'functions from int to int', just as:

l: list int;
m: list (int -> int);

declares l to be a list of ints, and m to be a list of 'functions from int to int'.

The representation of a value with the type 'unique X', for some type X, is a globally unique identifier, much like the GUIDs used by COM, etc., created when the unique value is created. Thus unique values are comparable, even if the types they are applied to are different, so that after the execution of:

a: unique int;
b: unique int;
c: unique int;
d: unique float;

a := new unique int;
b := new unique int;
c := a;
d := new unique float;

all of the comparisons, a == a, a == b, a == c, a == d, b == c, etc., are valid, but only a == c (and of course, a == a, etc) will be true. Note that a == b is false; the unique type constructor does not attempt to return a result specifically for int, or any other type - every call returns a new, unique value, and:

x := new unique void

would be a good way of producing GUIDs for their more traditional uses.

So, given that unique values hardly ever seem to match, and then only match with themselves, why are they useful? Consider the type:

∃ T . (unique T × T)

That is, the type of a pair whose first element is a unique for some, unknown, type, and whose second element is a value with that unknown type. Now say that we already have a unique value, u say, then we can compare u with the first element of a value with this pair type (since, as we saw above, unique values can be compared, regardless of the type that they carry).

It is most likely that the result of the comparison will be false, but what happens if the result is true? Since we know that every construction of a new unique value returns a different GUID, we can conclude that u and the first element of the pair must be copies originating from the same construction operation. Furthermore, we can conclude that, since u has the type 'unique int', the construction operation must have been 'new unique int', and hence that the unknown type in the pair must also be 'int'!

This argument is not changed by replacing 'int' by any arbitrarily complex type, and so the unique type constructor gives us a type-safe mechanism for comparing arbitrarily complex types, for the run-time cost of comparing two GUIDs!

The practical application of unique values

Of course, this only works if we can arrange to have the correct unique values to hand, but in the situations where unique is most useful, this in not difficult to achieve. Consider the separate compilation of modules: traditionally when a module interface specification is compiled, the result is the type of the module, encoded in some way. However, one of the goals of Ten15 was that the entire system should be strongly typed, including the operations of compiling, linking and running applications and modules. The type of a module would be:

void -> M

or, more likely:

L -> M

where L is the type of all the other modules, system linkages, data, etc, that the module needs to link to in order to do it's job.

When we compile an implementation of the module the compiler proves that the implementation gives a module with the type L -> M, and returns a code object with the type:

Type × (L -> M)

a pair, the first element of which is a run-time representation of the type of the second element, which is the run-time value of the module - usually a record consisting of exported functions and data. This pair gets reduced to the type of a 'code object' or code file:

∃ T . (Type × T)

since all code objects must ultimately have the same type.

The application is also compiled with the knowledge of the module's interface specification, and so it also knows that the type of the module is L -> M.

Now consider the run-time operation of loading, linking and using the module. Loading the module consists of getting it's implementation into memory, and is beyond the scope of this discussion except that an object with the type ∃ T . (Type × T) is returned. Linking the module into the running application is a two step process, first it must be proved that the module does indeed provide an implementation with the type expected by the application, in this case, L-> M. The second step is calling this linking function with the correct linking parameters to get a module instance (with the type M), that can be used in the rest of the application.

The first step in this process is potentially quite expensive, since it involves a run-time compatibiltiy check between the type resulting from the compilation of the module interface, and compiled into the application, and the type of the module, represented by the first element of the pair returned by the loading phase. This process notoriously slows down the linking and startup of java applications. However, this cost can usually be completely avoided using unique values, as follows.

When the interface is compiled, the resulting type of the module becomes:

unique (L -> M) × (L -> M)

and this is stored in (probably actually is!) the compiled representation of of the module's interface.

The compiler produces an implementation of the module, which it proves has the type L -> M (doing all the costly type comparisons necessary), and includes the unique in the code file that results - giving it the type:

∃ T . (unique T × T)

The application is also compiled using the compiled representation of the module interface, and so the unique s known to it too. The loader returns the pair above, but the first step of the linker can now compare the unique that was compiled into it with the unique in the loaded module. if they match, then we know that the loaded module was compiled against the same interface specification as the application, and no run-time comparison of the type representations is necessary.

Of course, the module type could be defined to be:

∃ T . (unique T × Type × T)

allowing the long-hand type compatibility check to be done if the comparison of uniques does not provide the type compatibility proof. There would be the run-time cost associated with the traditional linking technique, but this would allow compatible recompilations of the interface without requiring the recompilation of all the applications and implementations - including this would be a choice that the system designer could make, when defining the linking alogorithms. One disadvantage of this would be the worry that over time all linking operations would degenerate to the long type-comparison form. A cleverer module interface compiler could attempt to prove the compatibility of a new module specification with the unique in the old compiled representation, and so avoid this problem to some extent.

Conclusions

The unique type constructor, as defined in Ten15, provides an efficient, type-safe mechanism for proving the type-compatibility of values coming by different routes to a place in the system where they are to be combined. Values could come from disk files, or even come from other machines, across networks (Ten15 disk storage, and network communicates were both strongly-typed). In most cases it avoids the necessity for a run-time comparision of type information, which is a open-ended, and often large cost.