I’m still waiting for my computer to typecheck this file, which I concocted yesterday evening. I’m told by someone with a better computer (thanks, Darin) that it does typecheck, and moreover, satisfy the termination checker. It relies unexcitingly on the size-change principle and is easily rendered entirely structural.
What I’ve managed to do (at last!) is to implement an Agda model of Observational Type Theory for a hierarchy of universes, in this case Nat-indexed, but that’s probably not crucial. It’s a reasonably good fit with what we envisage for Epigram 2, once stratified. However, the real thing will not suffer all of the turd-transportation I just did to simulate cumulativity (the embedding of types from one universe into the next). Ouch, that hurt! It’s not pretty, but previously I didn’t know how to do it at all.
Let me try to explain what’s going on…
Basic Types
In our system, we have a bunch of base types (named after the number of values each contains),
0
1 ∋ []
2 ∋ tt, ff
and a bunch of ways to build types up compositionally. We have the dependent functions in Π-types and the dependent pairs in Σ-types, respectively
(x : S) → T(x) ∋ λs → t if x : S |- T(x) ∋ t
(x : S) × T(x) ∋ [s | t] if S ∋ s; T(s) ∋ t
In honour of John McCarthy, I’ll write [a|[stuff]] as [a stuff], including [a|[]] as [a]. (If I were feeling particularly austere, I’d insist that ff is [] and tt is [[]], doing the whole lot with one atom. The budget will run to three.)
Inductive Datatypes
We’ll also need some inductive datatypes. In Epigram 2, we give you a fancy way to encode the description of your inductive datatypes. Here, I’ll be more primitive but just as expressive and give you the Petersson-Synek trees, also known as ‘indexed W-types’ or ‘terminating interaction structures’ or (in Hankovian) ‘Presbyterian families’ or… they don’t have a snappy name, but they’re the best.
They look like W(i:I. s:S(i). p:P(i,s). r(i,s,p)) j, which might take me a wee bit of explaining. The plan is to define a bunch of mutually inductive datatypes, one for each element of the index set, I. The S, P, r components explain the possible structure of the data, and the j is a particular index in I, selecting one type from the bunch. For simple inductive types like the natural numbers, I is 1.
The dots in my notation indicate binding, so the shape set S(i) can depend on a given index i. Most inductive datatypes have a choice of node-shape and perhaps some data stored in the nodes: S(i) gives a type that encodes both. (Note that here, the shape set can be computed from the index, whereas the usual inductive family/GADT way is to read the index of the data from the information in the node, which just isn’t as much fun.) For the natural numbers, we need a shape for zero and a shape for successor, so we take S(i)=2, and let’s say that’s ff for zero and tt for successor.
Each node of an inductive datatype might also have substructures. The position set P(i,s) gives each subnode an ‘address’. Of course, the position set must depend on the shape (and hence the index), because different node shapes will have different subnode structure. Fortunately, we can compute sets from values! For the natural numbers, we need P([],ff) = 0; P([],tt) = 1, because zero has no subnodes and successor has one.
Nearly done! It’s not quite enough to know what recursive positions there are. We also need to know which branch of the family each recursive position expects. That’s what r(i,s,p) is for. We must have r(i,s,p) : I. For the natural numbers, we have r([],ff,⊥); r([],tt,[]) = [], where ⊥ marks arguments which can’t be defined in patterns which thus require no response. We have a single, general-purpose inductive datatype!
What are the data in these types?
W(i:I. s:S(i). p:P(i,s). r(i,s,p)) j ∋ [s | f] if
S(j) ∋ s; (x:P(j,s)) → W(i:I. s:S(i). p:P(i,s). r(i,s,p)) r(j,s,p) ∋ f
That is, a node is given by a choice s of shape (for the given index j) and a function f mapping each recursive position p to some data in the branch of the family selected by r(j,s,p).
Being a perennial overloader (of operators, information, etc), I’ll be cheeky and give myself some type-directed sugar (marzipan?) for functions with finite domain.
(x : 0) → T(x) ∋ []
(x : 1) → T(x) ∋ [t] if T([]) ∋ t
(x : 2) → T(x) ∋ [f t] if T(ff) ∋ f; T(tt) ∋ t
For our type of natural numbers, that lets me code zero as [ff] and successor of n as [tt n].
These inductive datatypes are, on their own, less powerful than the inductive families of Agda, Coq or Epigram 1 (also GADTs) in that they don’t allow you to define a general equality, indexed by pairs (i,j) but inhabited only when i and j coincide. Only if you can decide the equality of i and j, can you code up equality this way. However, if the type theory acquires a separate notion of propositional equality, full expressivity will be restored. And that’s the plan.
A Propositional Fragment
Before we say what propositional equality is, let us consider what a proposition is. A proposition is a boring set, containing at most one value. We can erase them at run time, where we work only with values, because if we ever encounter a value in a propositional set, we don’t need to look at it to know what it is.
Let us select the propositional sets from our collection so far. 0 and 1 are propositional, but 2 is not. (x : S) → T(x) is propositional if T(x) always is. (x : S) × T(x) is propositional if S and T(x) are. Amusingly, W(i:I. s:S(i), p:P(i,s), r(i,s,p)) j is propositional just if S(i) is: the data content of a tree is just what can be stored in its nodes.
We can isolate the propositional fragment of our collection of sets, by introducing a new set-former Prf(P), meaning the set of proofs of the proposition P. It’s not vital to do so, but it can be handy. In Epigram 2, we consider all expressions in Prf(P) to be definitionally equal. That is, we have proof irrelevance built into the system. We have to pay for that by making sure we never pattern match on proofs when we’re trying to do computation on data. Pattern matching, except on 0, discriminates between canonical inhabitants of a type (which match successfully) and hypothetical inhabitants (which cause matching to get stuck), creating the potential for allegedly equal proofs to be treated differently. To achieve proof irrelevance, you need laziness!
A Predicative Hierarchy
If we want to write polymorphic functions, we need to abstract not only over values in sets, but also over sets themselves. One way to achieve this abstraction is to ensure that sets are themselves values in some sort of set. A set of sets is often called a universe. However, having a set of all sets leads to variants of Russell’s paradox. We fight our way out the way Russell did, by giving universes sizes. The universe of all small sets is a big set, the universe of all big sets is a bigger set, per ardua ad astra.
Given some universe U of sets, we can make the next universe Big(U) by
- allowing U itself, Big(U) ∋ U
- embedding all U’s sets, Big(U) ∋ T if U ∋ T
- closing Big(U) over all the usual stuff: 0, 1, 2, (x : S) → T(x), etc
Having a distinguished subuniverse of propositions complicates the picture a little. Each layer is given as a triple of (Set, Prop, Prf : Prop → Set), so we need to construct (Big(Set), Big(Prop), Big(Prf) : Big(Prop) → Big(Set)). We’ll have Prop, Set : Big(Set), Set ⊆ Big(Set), Prop ⊆ Big(Prop), and close Big(Set) under the usual set formers and Big(Prop) under the usual prop formers. This is rather unlike Coq, where there is one impredicative universe of propositions living at the base of the hierarchy, but allowing quantification over higher levels. Here, every layer of sets has its own distinguished fragment of props.
Propositional Equality of Sets and Values
As was ever so, we need to say when sets are equal and when values are equal. Aren’t sets values? Well, yes, but there’s a matter of size to contend with, and that will make it worth our while to separate the two notions.
At every level, we’ll have set equality and heterogeneous value equality
Prop ∋ S ↔ T if Set ∋ S, T
Prop ∋ (s:S)==(t:T) if Set ∋ S, T; S ∋ s; T ∋ t
Here, set equality is much stronger than isomorphism: it indicates run-time compatibility: values of one set can be used, without transformation, wherever values of the other are expected. Similarly, value equality indicates run-time replaceability without transformation: you won’t notice the difference (at least in terms of value) if you replace this with that. The key operators that work with equality are ‘coercion’ and ‘coherence’
coe(S,T:Set; Q:Prf(S↔T); s:S) : T
coh(S,T:Set; Q:Prf(S↔T); s:S) : Prf((s:S)==(coe(S,T,Q,s):T))
Coherence asserts that coercion doesn’t really do anything, and really, it doesn’t. It transports values between provably equal types, but by run-time, whenever we encounter provably equal types, they’re compatible. The compiler erases coe(S,T,Q,s) down to (the erasure of) s.
(I know it’s rather popular these days to consider a set equality which admits nontrivial (hence non-erasable) isomorphisms, e.g. via Voevodsky’s univalence axiom. I am very much interested in supporting a class of construction invariant under isomorphism, but that’s not what’s happening here. By choosing a finer equality, I do prevent the adoption of isomorphism as equality, but that’s not the only way to support isomorphism. I think it’s pragmatically helpful to distinguish the isomorphisms which cost from those which don’t.)
Note, though, that equality of sets lives at the same level as the sets themselves (as indeed would a pair of back-and-forth embeddings). If we were to rely on the value equality to relate sets, equality of sets would be a big prop. It’s good to keep things small when we can.
Coding a Universe Hierarchy in Agda
To construct a universe is to collect a bunch of sets as the image of a function. We choose a type U to encode or ‘name’ the types in our collection, then give a function El from U to the underlying notion of type, whatever that is. I’ll always use Agda’s lowest universe, Set, as the “underlying notion of type”.
The first thing to do is to define the underlying types which you want to collect.
data Zero : Set where record One : Set where constructor [] data Two : Set where tt ff : Two Pi : (S : Set)(T : S -> Set) -> Set Pi S T = (s : S) -> T s record Sg (S : Set)(T : S -> Set) : Set where constructor _,_ field fst : S snd : T fst open Sg public data Wi (I : Set)(S : I -> Set)(P : (i : I) -> S i -> Set) (r : (i : I)(s : S i) -> P i s -> I)(i : I) : Set where _<_ : (s : S i) (f : (p : P i s) -> Wi I S P r (r i s p)) -> Wi I S P r i
Now we can encode the collection as a definition by induction-recursion, giving datatype U and decoder El in tandem. To closed under (x : S) → T(x), we can write
mutual data U : Set where Pi' : (S : U)(T : El S -> U) -> U -- more codes El : U -> Set where El (Pi' S T) = (x : El S) -> El (T x) -- decoding more codes
See? We make use of the decoder, even to give types to constructors of the data.
My plan’s a spot more complex. Each level has two kinds of type: sets and props,
data Ki : Set where set prop : Ki
so I’m going to work with pairs of universes, just lifting everything over Ki
record LEVEL : Set1 where constructor level field uni : Ki -> Set el : {k : Ki} -> uni k -> Set open LEVEL public
The game is now, given one LEVEL, to make the next LEVEL. So we make an inductive-recursive definition of the next LEVEL, parametrized by the LEVEL we know.
mutual data UpU (L : LEVEL)(k : Ki) : Set where U' : {q : isSet k} -> Ki -> UpU L k El' : (T : uni L k) -> UpU L k Prf' : {q : isSet k}(P : UpU L prop) -> UpU L k B' : Two -> UpU L k Two' : {q : isSet k} -> UpU L k Pi' : (S : UpU L set)(T : UpEl S -> UpU L k) -> UpU L k Sg' : (S : UpU L k)(T : UpEl S -> UpU L k) -> UpU L k Wi' : (I : UpU L set) (S : UpEl I -> UpU L k) (P : (i : UpEl I)(s : UpEl (S i)) -> UpU L set) (r : (i : UpEl I)(s : UpEl (S i))(p : UpEl (P i s)) -> UpEl I) (i : UpEl I) -> UpU L k UpEl : {L : LEVEL}{k : Ki} -> UpU L k -> Set UpEl {L = L} (U' k) = uni L k UpEl {L = L} (El' T) = el L T UpEl (Prf' P) = UpEl P UpEl (B' b) = Tt b UpEl Two' = Two UpEl (Pi' S T) = Pi (UpEl S) \ s -> UpEl (T s) UpEl (Sg' S T) = Sg (UpEl S) \ s -> UpEl (T s) UpEl (Wi' I S P r i) = Wi (UpEl I) (\ i -> UpEl (S i)) (\ i s -> UpEl (P i s)) r i
where isSet k is One if k is set and Zero if k is prop. I could have made k an index and restricted the constructors which make only sets, but I feel like staying Presbyterian. Besides, you never need to store Ki values: you always check types against a given Ki.
Note that the U’ constructor effectively says that the sets of small sets and small props are each coded as large sets. The El’ constructor explicitly embeds small sets and props into large sets and props, respectively. Both are decoded by appeal to the incoming LEVEL. Prf’ explicitly maps each large prop to the set of its proofs. Two is a set, but the other type formers have both set and prop versions. The domain of Pi’ is always a set, but the domain of Sg’ matches whatever is demanded. Oh, and the B’ constructor codes Zero and One via Boolean truth values: it saves a little effort later.
We may now iterate the construction, kicking the whole lot off with an empty universe, to get a hierarchy of universes,
data Nat : Set where ze : Nat su : Nat -> Nat Level : Nat -> LEVEL Level ze = level (\ _ -> Zero) Kill Level (su n) = let L = Level n in level (UpU L) (UpEl {L}) _^_ : Ki -> Nat -> Set k ^ n = UpU (Level n) k EL : (n : Nat) -> set ^ n -> Set EL n T = UpEl {Level n} T
with the result that set ^ n and prop ^ n collect the level n sets and props respectively.
We can write the type of the polymorphic identity function (and also the function itself
IdTy : (n : Nat) -> set ^ su n IdTy n = Pi' (U' set) \ X -> El' (Pi' X \ x -> X) id : (n : Nat) -> UpEl (IdTy n) id n X x = x
One issue we’ll face later is that we could just as well have written
IdTy : (n : Nat) -> set ^ su n IdTy n = Pi' (U' set) \ X -> Pi' (El' X) \ x -> El' X
using big functions between lifted sets, rather than lifting small functions. The meaning is the same, but the coding is different. This raises the key question of how to make ↔ cope with variations in lifting which don’t actually induce representational incompatibility.
Before moving on, er, wow! We’ve built a whole hierarchy of universes, closed under the usual stuff, inside the lowest level of Agda! Is there a paradox? Well, we didn’t close each universe level under inductive-recursive definition, so we haven’t just built Agda’s universe hierarchy inside its own lowest level. We’ve built something like Coq’s hierarchy inside Agda’s lowest level. Agda is massively stronger than Coq!
If you want to play this game in Coq, you can’t quite, but you can nearly. You can build a finite hierarchy arbitrarily high, but you need to go up one Coq level for each level you add to your hierarchy. You can define U : Type(1); El : U -> Type(0), but U won’t itself fit in Type(0).
When are Sets Equal, in the OTT sense?
The idea of set equality, ↔, is to deliver the evidence needed by coe to transport values between representationally compatible sets. To achieve proof irrelevance, it’s enough that coe doesn’t pattern match on the evidence except when it’s time to go on strike because the equation is manifestly false.
The definition of ↔ is supposed to be structural. Two sets are equal if they have the same type-former and coincide componentwise. Two families of sets coincide if they give equal sets when instantiated with equal values. So set equality clearly involves value equality.
When are Values Equal?
Value equality is heterogeneous, (s : S) == (t : T), with a conditional sense: ‘if S and T are equal sets, then s and t are equal values’. The nice thing about that choice (and it is a choice: we could say, as in my thesis, ‘S and T are equal sets; s and t are equal calues’) is that we only need to work to reject false equations between comparable things. Lots of equations are vacuously true. In particular, we can let a proof be equal to anything, as whenever the types happen to be comparable, proof irrelevance kicks in.
Equations only have interesting proofs when they’re false: we go on strike when we nail a lie. The definition of value equality should equip us to access false equations. If we’re to falsify the equality of two functions, we need to be able to feed them equal arguments and falsify the equality of the results. If we’re to falsify the equality of two pairs, we need to be free to choose the projection in which we find a discrepancy.
My Horrendous Agda Encoding
This is where things get hideous. The trouble is caused by making El’ a constructor, meaning that the same underlying type gets coded in lots of different ways. We could choose to be pernickety and reject equations between distinguishable codings. However, my plan for stratifying Epigram makes cumulativity silent. To model that, I need to ignore unimportant variations in the placement of El’. But I wasn’t born yesterday (even though I only got the point, yesterday).
One way to be El’-tolerant is to generalize the notion of set equality. Rather than restricting to a single level, consider when a set ^ m can be said to equal a set ^ n in prop ^ p. If we have El’ S : set ^ m, then we’re sure m isn’t zero, and we can just strip El’ and compare S at the previous level. The hassle is that we sometimes need to write propositions at level p which quantify over domain sets arising as components of the set inputs: but these will be at level m or level n, not level p. We need to be sure that we can send sets (and props) from level m or n to level p, and be sure they mean the same thing when they get there. Hence this kludge
data _===_ (X : Set) : Set -> Set1 where Refl : X === X record Embedding (m p : Nat) : Set1 where constructor _&_ field Em : (k : Ki) -> k ^ m -> k ^ p Nop : (k : Ki)(T : k ^ m) -> UpEl {Level p} (Em k T) === UpEl {Level m} T open Embedding public idEm : {p : Nat} -> Embedding p p idEm = (\ k T -> T) & \ k T -> Refl suEm : {m p : Nat} -> Embedding (su m) p -> Embedding m p suEm (Em & Q) = (\ k T -> Em k (El' T)) & \ k T -> Q k (El' T)
The embedding function is always an iteration of El’ constructors, and exactly because decoding El’ just invokes the lower level decoder, the underlying types always match on the nose. The trivial idEm embedding gets us from p to p, and the suEm operation is just what we need to move the source of the embedding down one layer, stripping off an El’.
Now we can work with level p sets decomposed as a zipper of iterated El’ constructors drilling down to level m, say, together with a level m set, guaranteeing that the version in focus and the lifted version mean exactly the same.
(Observe that _===_ is horribly intensional. But it’s far too big (in Set1!) to ever be expressible in our universe.)
The trouble is that we often need the same value in two places at once. This piece of kit helps us build level p props about level n stuff.
COE : {S T : Set} -> S === T -> S -> T COE Refl s = s upPi' : {k : Ki}{n p : Nat}(E : Embedding n p) -> (S : set ^ n) -> (EL n S -> k ^ p) -> k ^ p upPi' E S T = Pi' (Em E set S) (\ s' -> T (COE (Nop E set S) s'))
I also need some kit for building level p conjunctions and implications.
[_]_*_ : {k : Ki}(n : Nat) -> k ^ n -> k ^ n -> k ^ n [ _ ] P * Q = Sg' P \ _ -> Q [_]_=>_ : (n : Nat) -> prop ^ n -> prop ^ n -> prop ^ n [ _ ] P => Q = Pi' (Prf' P) \ _ -> Q
So we get to build
SETEQ : (m : Nat)(S : set ^ m)(n : Nat)(T : set ^ n) -> (p : Nat)(mp : Embedding m p)(np : Embedding n p) -> prop ^ p
I’ll give you the edited highlights, clipped out of an ugly pile of strictness-wrangling. The El’ payoff is at the tail end, shunting El’ into the ‘zipper’:
SETEQ ze (El' ()) n T p mp np SETEQ (su m) (El' S) n T p mp np = SETEQ m S n T p (suEm mp) np SETEQ m S ze (El' ()) p mp np SETEQ m S (su n) (El' T) p mp np = SETEQ m S n T p mp (suEm np)
For Pi-types, we have
SETEQ m (Pi' S S') n (Pi' T T') p mp np = [ p ] SETEQ n T m S p np mp * upPi' np T \ t -> upPi' mp S \ s -> [ p ] VALEQ n T t m S s p np mp => SETEQ m (S' s) n (T' t) p mp np
with a contravariant twist that sets up the way coercion works. Sigma is similar, but covariant in both places.
I gave this componentwise definition for indexed W,
SETEQ m (Wi' I S P r i) n (Wi' J T Q u j) p mp np = [ p ] SETEQ m I n J p mp np * [ p ] VALEQ m I i n J j p mp np * upPi' mp I \ i -> upPi' np J \ j -> [ p ] VALEQ m I i n J j p mp np => [ p ] SETEQ m (S i) n (T j) p mp np * upPi' mp (S i) \ s -> upPi' np (T j) \ t -> [ p ] VALEQ m (S i) s n (T j) t p mp np => [ p ] SETEQ n (Q j t) m (P i s) p np mp * upPi' np (Q j t) \ q -> upPi' mp (P i s) \ o -> [ p ] VALEQ n (Q j t) q m (P i s) o p np mp => VALEQ m I (r i s o) n J (u j t q) p mp np
which is quite a palaver, but it’s just the componentwise thing for five rather dependent components. I saved a bit of work by sharing the quantifiers wherever possible. It occurs to me to wonder whether I really need to insist that the index sets coincide, as the fact is never used in coercion. But the basic ‘why try harder’ plan is just to insist on structural equality and orientate the equations according to the needs of coercion.
It’s worth mentioning.
SETEQ m (Prf' P) n (Prf' Q) p mp np = PROPEQ m P n Q p mp np PROPEQ : (m : Nat)(P : prop ^ m)(n : Nat)(Q : prop ^ n) -> (p : Nat)(mp : Embedding m p)(np : Embedding n p) -> prop ^ p PROPEQ m P n Q p mp np = [ p ] ([ p ] Em mp prop P => Em np prop Q) * ([ p ] Em np prop Q => Em mp prop P)
Equality of propositions is mutual implication: we have proof irrelevance anyway, so we don’t need to worry about preserving representation. Proofs are run-time interchangeable because you replace one smoking hole in the ground where a term used to be with another.
Meanwhile, value equality is a bit neater. We need
VALEQ m Two' tt n Two' ff p mp np = B' ff VALEQ m Two' ff n Two' tt p mp np = B' ff
We’ll have extensionality
VALEQ m (Pi' S S') f n (Pi' T T') g p mp np = upPi' mp S \ s -> upPi' np T \ t -> [ p ] VALEQ m S s n T t p mp np => VALEQ m (S' s) (f s) n (T' t) (g t) p mp np
For indexed W, we define a recursive helper:
VALEQ m (Pi' S S') f n (Pi' T T') g p mp np = VALEQ m (Wi' I S P r i) s n (Wi' J T Q u j) t p mp np = WEQ i s j t where WEQ : (i : EL m I)(s : EL m (Wi' I S P r i)) (j : EL n J)(t : EL n (Wi' J T Q u j)) -> prop ^ p WEQ i (s < f) j (t < g) = [ p ] VALEQ m (S i) s n (T j) t p mp np * upPi' mp (P i s) \ p -> upPi' np (Q j t) \ q -> WEQ (r i s p) (f p) (u j t q) (g q)
To compare props or sets as values, just embed the relevant notion from the level below.
VALEQ ze (U' _) () n (U' _) t p mp np VALEQ (su y) (U' _) s ze (U' _) () p mp np VALEQ (su m) (U' set) S (su n) (U' set) T p mp np = SETEQ m S n T p (suEm mp) (suEm np) VALEQ (su m) (U' prop) P (su n) (U' prop) Q p mp np = PROPEQ m P n Q p (suEm mp) (suEm np)
Meanwhile, El’-stripping goes as before
VALEQ ze (El' ()) s n T t p mp np VALEQ (su m) (El' S) s n T t p mp np = VALEQ m S s n T t p (suEm mp) np VALEQ m S s ze (El' ()) t p mp np VALEQ m S s (su n) (El' T) t p mp np = VALEQ m S s n T t p mp (suEm np)
Coercion has this type
coe : (m : Nat)(S : set ^ m)(n : Nat)(T : set ^ n) -> (p : Nat)(mp : Embedding m p)(np : Embedding n p) -> EL p (Prf' (SETEQ m S n T p mp np)) -> EL m S -> EL n T
following the same zipper-style management of El’. The code is basically a bloodbath. Lots of it is spent telling untrue equations to go away. Even more of it is spent negotiating the fact that shunting values up and down the El’-zipper does sod all: I didn’t quite manage to persuade the with-notation to handle that, but I might try harder, another time. It wasn’t my idea of fun, but I bashed it out.
So what?
What we have here is the basis of a shallow embedding of the Epigram 2 kernel in Agda, up to a point. The definitional equality of Epigram 2 (with stuff like proof irrelevance) is rather more generous than the Agda model can offer, but the propositional equality theory is the same in both cases. The plan is that every Epigram 2 kernel term can be expressed in the Agda model, at the cost of inserting explicit appeals to coercion where once definitional equality sufficed. Fortunately, coercion by plausible equations always computes under constructors, so the extra coercions don’t interfere with the basic βδι-reduction (ordinary computation by function application, definition expansion and pattern matching). We should thus be able to conclude that either Epigram 2 evaluation is terminating, or that Agda’s in trouble too.
The coercion implementation is conspicuously lazy in the proof of equality, except when refuting falsehoods. Correspondingly, the only way we can lose canonicity is if the system is inconsistent. But everything we’re doing here can be modelled (less computationally) in Extensional Type Theory, so consistency seems quite likely.
It was bloody as hell, but I’ve modelled a cumulative hierarchy of extensional universes. The real deal isn’t nearly so bloody, but this construction gives some clue as to how to argue that the system we’re going for might actually make sense.