The first guiding principle of any solution is to avoid shadow constructs. We already have parameterization support, let's use it.
Generics are functions from types to types, typically classes to classes.
Pedantic note 3: If your language is prototype based, generics might be considered functions between prototypes. If your language has primitive types - well, you're up the creek without a paddle anyway. There is no justification for primitive types in an object oriented language.
If classes are expressions, we can write reified generics as ordinary functions. Here's some sample pseudo-code. It's given in a quasi-standard syntax, so I don't waste time explaining Newspeak syntax.
public var List = (function(T) {
return class {
var hd, tl;
class Link {
public datum;
public next;
}
public elementType() { return T}
public add(e) {
var tmp := Link new.
tmp.datum := e;
tmp.next := hd;
tl := hd;
hd := tmp;
return e;
}
}
}).memoize();
Here's a summary of what the above means:
- We declare a variable named List, initialized to a closure.
- The closure takes a class T as a parameter and returns a class as its result.
- The result class is specified via a class expression, which implements a linked list.
- The class expression includes a nested class declaration, Link.
- The method memoize() is called on the closure to, well, memoize it. Memoize() returns a memoized version of its receiver.
Each call to List() returns a list class specialized to the actual parameter of List(). We can create a list of integers by saying
var lst := List(Integer).new();
and we can dynamically check what type of elements lst holds
lst.elementType(); // returns Integer, the class object reifying the integer type.
The reified element type is shared among all instances of a given list class, because it is stored in the closure surrounding the class. We avoid duplicating classes with the same parameters - this is just function memoization (and I assume a memoize() method on closures for this purpose). All this works independent of any static types. We are just using standard runtime mechanisms like closures, memoized functions, objects representing classes and yes, class expressions. Smalltalk had these, in essence, over 40 years ago.
What if I don't have class expressions? Well, don't you know that everything should be an expression? Besides, this works fine if you have the ability to define classes reflectively like Smalltalk, or have properly defined nested classes like Newspeak, though it may be a bit more verbose and require more library support to be palatable.
Now let's add types. In the code below, type annotations are completely optional, and have absolutely no runtime effect. They are shown in blue.
public var List = (function(T : Class) {
return class {
var hd, tl : Link;
class Link {
public datum : T;
public next : Link;
}
public elementType() : Class { return T}
public add:(e : T) : T {
var tmp: Link := Link new.
tmp.datum := e;
tmp.next := hd;
tl := hd;
hd := tmp;
return e;
}
}
}).memoize();
You may notice one odd thing - we use the name of the formal parameter to the closure, T, as a type. This is justified by the following rule.
Rule 1: In any method or closure m, a formal parameter T of type Class (or any subtype thereof) implicitly defines a type variable T which is in scope in type expressions from the point T is declared to the end of the method/closure.
Next, we need to be able to use the information given by the declaration of List when we write types like List[Integer]. We use the following rule.
Rule 2: If e is an expression of function type with parameter(s) of type Class and return type Class, e's name can be used inside a type expression as a type function; an invocation of the type function e[T1, ..., Tn] denotes the type of the instances of the class returned by the value expression
e(T1, ..., Tn).
We can then write, and typecheck
var lst : List[Integer] := List(Integer).new();
var i : Integer := lst.add(3) + 4;
Oh, and we can still do this:
lst.elementType(); // returns Integer, the class object reifying the integer type.
Similarly, if you wish to create new instances of the incoming type parameters, you should be able to do that in the above regime - though you will have to confront the fact that different subtypes may have different constructors and plan around that explicitly - say, by defining a common construction interface for these types.
The beauty of this scheme is that no runtimes were harmed in the making of this reified generic type system. The type system is completely optional. And this is my point: reification was there all along. The typechecker simply needs to understand this fact and leverage it. The basic approach would work with any language with types reified as values, regardless of whether it has generics.
Interestingly, we now have reification of generics, and erasure, at the same time. The two are not in conflict. Reification is supported by the normal runtime mechanisms, independent of types, which are optional and always erased, carrying no runtime cost or semantics.