Lower Taxation Representation?

A while ago, I was worrying about the term representation we use, and how much it was costing us. I’ve been thinking, and I’ve got a few ingredients. Let’s take a look at the terms, so far. (Yes, I’m using the Strathclyde Haskell Enhancement.)


> data Tm :: {TmPart, Bool, Nat} -> *     where

Terms are indexed by

  • whether they may stand in the body of a term, or only at the head
  • 
> data TmPart :: *     where
>   Body  :: TmPart
>   Head  :: TmPart

  • whether they are weak-head-normal, or not
  • how many de Bruijn variables are free

Bear in mind this special case: the closed weak-head-normal forms.


> type Wh =  Tm  {Body, True,  Z}

which demands that the term be ready for top-level inspection: you can match on a Wh and you know you can see what it means. You can’t accidentally forget to compute a term to weak-head-normal form and use it where a Wh is demanded. (But you can pattern match on a Tm, so care is still required.)

Back to the terms…


>   Set    ::  Int ->  Tm  {Body, b, n}

This time, it’s stratified! Now, mind this: Set is available whether or not the term is supposed to be weak, and however many binders we’re under. The indices represent requirements, not measurements.


>   B     ::  Bop {n}  ->
>             G :*: {m, n} ->   Tm {Body, b0, S m} ->
>             Tm {Body, b1, n}

OK, this is where it gets a bit peculiar. B is for bind, starting with n free variables. Bop {n} is the type of binders available (Π and Σ with domain annotations, λ without).


> data Bop :: {Nat} -> *    where
>   Pi  ::  Tm {Body, b, n} ->  Bop {n}
>   Sg  ::  Tm {Body, b, n} ->  Bop {n}
>   La  ::                      Bop {n}

The body of the binder is not obliged to be head normal (it’s not at the head!) and it’s got {S m} (at least one) free variables. Between {m} and {n}, there’s an environment, which can be (and often is) Nil, making m=n. Including an environment allows us to suspend the execution of a function until its argument turns up.

Next, the unit type, pairs and void. Not much to see here


>   One   ::  Tm {Body, b, n}
>   (:&)  ::  Tm {Body, b0, n} ->  Tm {Body, b1, n}  ->
>             Tm {Body, b2, n}
>   Void  ::  Tm {Body, b, n}

So far, it’s all been intro forms. Of course, there should be more features, but that’s enough to get the idea. Meanwhile, how do you do anything? I’ve used a spine representation, rather than a type of neutral terms. Why? I want immediate access to the head, so I can see if there’s a redex. A spine is a backward list of eliminators — backwards so we can easily stick more on the end.


>   (:$)  ::  Tm {Head, b, n} -> Bwd (Elim {n})  ->
>             Tm {Body, b, n}

I’ll get to heads in a moment. Here, spines are made of applications and projections.


> data Bwd x = B0 | Bwd x :< x deriving  (Show,  Eq)
>
> data Elim :: {Nat} -> *   where
>   A   :: Tm {Body, b, n} -> Elim {n}
>   Hd  :: Elim {n}
>   Tl  :: Elim {n}

Now, here’s a possibility we didn’t have before. As long as you’re not obliged to produce a head-normal form, why shouldn’t you have a closure? We can choose to suspend evaluation outside of head position.


>   (:/)  ::  Tm {Body, b, m} ->  G :*: {m, n}  ->
>             Tm {Body, False, n}

At the head of a spine, you can’t have anything you like, but hopefully you can have what you need: free variables, bound variables and… weak-head-normal forms?


>    R     ::  Ref {b} ->  Tm {Head, b,  n}
>    V     ::  Fin {n} ->  Tm {Head, False, n}
>    W     ::  Wh ->       Tm {Head, False, n}

Er, um, check this first: a free variable is a context reference, either a declaration or a definition. Both carry a type, but a definition also has a value and is definitely not weak-head-normal.


> data Ref :: {Bool} -> *   where
>   Dcl  :: Nom ->        Wh -> Ref  {b}
>   Dfn  :: Nom -> Wh ->  Wh -> Ref {False}

Neither is a bound variable weak-head-normal (for it must be looked up in the environment). Meanwhile, if you’re implementing an operator which gets a bunch of Wh values and has to compute something, you’d want to be able to do stuff with those values: as long as you don’t claim you’re making a weak-head-normal form, you can.

Let’s get the hang of environments now. Just checking our numbers


> data Nat :: *    where
>   Z  :: Nat
>   S  :: Nat -> Nat

and finite sets, used as de Bruijn indices since last century.


> data Fin :: {Nat} -> *    where
>   Fz  :: Fin {S n}
>   Fs  :: Fin {n} -> Fin {S n}

Well now, environments are an instance of :*:, the relational Kleene star. (Some people have started calling these ‘thrists’, but I don’t see why they need a new name.)


> data (:*:) :: ({i, i} -> *) -> ({i, i} -> *)   where
>   Nil   :: r :*: {i, i}
>   (:>)  :: r {i, j} -> r :*: {j, k} -> r :*: {i, k}

That’s to say, environments are made of little bits of environment extension:


> data G :: {Nat, Nat} ->  *   where
>   G    :: Wh -> G {S n, n}
>   Nix  :: G {Z, n}

You can grow the context, but you can also Nix it, throwing away all the values. This type’s negotiable: as long as you can explain the action of each chunk.


> (/+/) :: G :*: {i, j}  -> G :*: {j, k} -> G :*: {i,  k}
> Nil         /+/ gs   = gs
> (Nix :> _)  /+/ _    = Nix :> Nil
> (g :> gs)   /+/ gs'  = g :> (gs /+/ gs')

Let’s also make sure we can project. Of course, you know that if you have a variable in your hand, the corresponding environment can’t have been Nixed.


> (/!) :: G :*: {n, Z}  -> Fin {n} ->  Wh
> (G v :> _)  /! Fz = v
> (G _ :> g)  /! Fs i = g /! i

Let’s evaluate, given an environment that goes all the way down. For canonical stuff, we’re weak-head-normal already, so we just close up underneath the head constructor, stashing the (shared) environment.


> (//) :: Tm {p, b, n}  -> G :*: {n, Z} ->  Wh
> Set i           // _   = Set i
> B (Pi s) g' t   // g   = B (Pi (s :/ g)) (g' /+/ g) t
> B (Sg s) g' t   // g   = B (Sg (s :/ g)) (g' /+/ g) t
> B La g' t       // g   = B La (g' /+/ g) t
> (s :& t)        // g   = (s :/ g) :& (t :/ g)
> One             // _   = One
> Void            // _   = Void

Otherwise, we might have some work to do. Bound vars, we fish from the environment; definitions, we expand; eliminations we must actually compute (see below). For closures, we unpack the stashed environment, stitch it into place, and carry on computing. Note that always the indexing keeps us honest: we can’t leave a de Bruijn variable dangling; we can’t unpack a closure and discard its environment because only that environment has the correct starting index to let us evaluate the term!


> V i             // g   = g /!   i
> R (Dfn _ v _)   // _   = v
> R (Dcl x s)     // _   = R (Dcl x s) :$ B0
> (f :$ es)       // g   = (f // g) $$ (es, g)
> (t :/ g')       // g   = t // (g' /+/ g)
> W t             // g   = t

How to compute. One step goes like this. We provide an environment, because we haven’t evaluated the spine yet.


> elim ::  G :*: {n, Z} -> Wh -> Elim {n} ->  Wh
> elim g (B La g' b)  (A a)  = b // (G (a // g) :> g')
> elim g (f :$ es)    (A a)  = f :$ (es :< A (a :/ g))
> elim g (a :& b)     Hd     = a // Nil
> elim g (f :$ es)    Hd     = f :$ (es :< Hd)
> elim g (a :& b)     Tl     = b // Nil
> elim g (f :$ es)    Tl     = f :$ (es :< Tl)

Careful with the β rule! We evaluate the argument a in its environment, then bung the result into the stashed environment, making it ready (at last!) to evaluate b. Try mixing the environments up or leaving them out and see if you can get past GHC.

Now, to do the whole spine, we need


> ($$) ::  Wh -> (Bwd (Elim {n}), G :*: {n, Z}) ->  Wh
> n $$ (es, g) = bfold (elim g) n es
>
> bfold ::  (x -> y -> x) -> x -> Bwd y -> x
> bfold f x B0         = x
> bfold f x (ys :< y)  = f (bfold f x ys) y

And that's us computing! When I get a moment, I'll show you my cheap trick (with coaching from Saizan) for writing


< la$ \ f -> la$ \ x ->   f (f x)

instead of


< B La Nil (B La Nil
<     (V 1 :$ (B0 :< A (V 1 :$ (B0 :< A (V 0 :$ B0))))))

After that, I suppose I’d better show how to check types for these things, with an environment rather than substitutions all over the shop.

But it comes to this — a first-order representation of terms (so hole-filling does not require an expensive quotation process); weak-head-normal forms rather than full normalization; scope-safe environment management; the ability to insist on weak-head-normal form by type.

Leave a Reply