The 'Unique' type constructor in Ten15
(Examples in a ficticious, hopefully obvious strongly-typed, higher-order
language)
'Unique' is a type constructor, like 'list'. That is, a (compile-time)
function from types to types. Thus:
a: unique int;
b: unique (int -> int);
Declares a and b to be unique ints, and unique 'functions from int to
int', just as:
l: list int;
m: list (int -> int);
declares l to be a list of ints, and m to be a list of 'functions from
int to int'.
The representation of a value with the type 'unique X', for some
type X, is a globally unique identifier, much like the GUIDs used
by COM, etc., created when the unique value is created. Thus unique values
are comparable, even if the types they are applied to are different, so
that after the execution of:
a: unique int;
b: unique int;
c: unique int;
d: unique float;
a := new unique int;
b := new unique int;
c := a;
d := new unique float;
all of the comparisons, a == a, a == b, a == c, a == d, b == c, etc.,
are valid, but only a == c (and of course, a == a, etc) will be true. Note
that a == b is false; the unique type constructor does not attempt to return
a result specifically for int, or any other type - every call returns a
new, unique value, and:
x := new unique void
would be a good way of producing GUIDs for their more traditional uses.
So, given that unique values hardly ever seem to match, and then only
match with themselves, why are they useful? Consider the type:
∃ T . (unique T × T)
That is, the type of a pair whose first element is a unique for some,
unknown, type, and whose second element is a value with that unknown type.
Now say that we already have a unique value, u say, then we can compare u
with the first element of a value with this pair type (since, as we saw above,
unique values can be compared, regardless of the type that they carry).
It is most likely that the result of the comparison will be false, but
what happens if the result is true? Since we know that every construction
of a new unique value returns a different GUID, we can conclude that u
and the first element of the pair must be copies originating from the same
construction operation. Furthermore, we can conclude that, since u has
the type 'unique int', the construction operation must have been 'new unique
int', and hence that the unknown type in the pair must also be 'int'!
This argument is not changed by replacing 'int' by any arbitrarily complex
type, and so the unique type constructor gives us a type-safe mechanism
for comparing arbitrarily complex types, for the run-time cost of comparing
two GUIDs!
The practical application of unique values
Of course, this only works if we can arrange to have the correct unique
values to hand, but in the situations where unique is most useful, this
in not difficult to achieve. Consider the separate compilation of modules:
traditionally when a module interface specification is compiled, the result
is the type of the module, encoded in some way. However, one of the goals
of Ten15 was that the entire system should be strongly typed, including the
operations of compiling, linking and running applications and modules. The
type of a module would be:
void -> M
or, more likely:
L -> M
where L is the type of all the other modules, system linkages, data,
etc, that the module needs to link to in order to do it's job.
When we compile an implementation of the module the compiler proves that
the implementation gives a module with the type L -> M, and returns a
code object with the type:
Type × (L -> M)
a pair, the first element of which is a run-time representation of the
type of the second element, which is the run-time value of the module -
usually a record consisting of exported functions and data. This pair gets
reduced to the type of a 'code object' or code file:
∃ T . (Type × T)
since all code objects must ultimately have the same type.
The application is also compiled with the knowledge of the module's
interface specification, and so it also knows that the type of the module
is L -> M.
Now consider the run-time operation of loading, linking and using the
module. Loading the module consists of getting it's implementation into memory,
and is beyond the scope of this discussion except that an object with the
type ∃ T . (Type × T) is returned. Linking the module into the running
application is a two step process, first it must be proved that the module
does indeed provide an implementation with the type expected by the application,
in this case, L-> M. The second step is calling this linking function
with the correct linking parameters to get a module instance (with the type
M), that can be used in the rest of the application.
The first step in this process is potentially quite expensive, since it
involves a run-time compatibiltiy check between the type resulting from
the compilation of the module interface, and compiled into the application,
and the type of the module, represented by the first element of the pair
returned by the loading phase. This process notoriously slows down the linking
and startup of java applications. However, this cost can usually be completely
avoided using unique values, as follows.
When the interface is compiled, the resulting type of the module becomes:
unique (L -> M) × (L -> M)
and this is stored in (probably actually is!) the compiled representation
of of the module's interface.
The compiler produces an implementation of the module, which it proves
has the type L -> M (doing all the costly type comparisons necessary),
and includes the unique in the code file that results - giving it the type:
∃ T . (unique T × T)
The application is also compiled using the compiled representation of
the module interface, and so the unique s known to it too. The loader returns
the pair above, but the first step of the linker can now compare the unique
that was compiled into it with the unique in the loaded module. if they match,
then we know that the loaded module was compiled against the same
interface specification as the application, and no run-time comparison of
the type representations is necessary.
Of course, the module type could be defined to be:
∃ T . (unique T × Type × T)
allowing the long-hand type compatibility check to be done if the comparison
of uniques does not provide the type compatibility proof. There would be
the run-time cost associated with the traditional linking technique, but this
would allow compatible recompilations of the interface without requiring the
recompilation of all the applications and implementations - including this
would be a choice that the system designer could make, when defining the
linking alogorithms. One disadvantage of this would be the worry that over
time all linking operations would degenerate to the long type-comparison
form. A cleverer module interface compiler could attempt to prove the compatibility
of a new module specification with the unique in the old compiled representation,
and so avoid this problem to some extent.
Conclusions
The unique type constructor, as defined in Ten15, provides an efficient,
type-safe mechanism for proving the type-compatibility of values coming by
different routes to a place in the system where they are to be combined.
Values could come from disk files, or even come from other machines, across
networks (Ten15 disk storage, and network communicates were both strongly-typed).
In most cases it avoids the necessity for a run-time comparision of type information,
which is a open-ended, and often large cost.
Copyright 2002, Martin Atkins
This page can be linked to, and unedited copies can be freely reproduced.