Let-aliasing in Typed Racket

:: Typed Racket

Although Typed Racket (TR) can currently typecheck a large number of common idioms found in Racket code, reasoning about direct and indirect usages of aliasing has escaped its grasp... until now!

Motivation for Aliasing

Typed Racket’s use of logical propositions about types enables it to typecheck programs whose control flows don’t neatly fit within the realm of what traditional type systems can check.

Here, for example, the type checker recognizes the result of (number? x) has type-related implications, whether it evaluates to #t or #f:

(: foo (Any -> Number))
(define (size x)
  (cond
    [(number? x) x]
    [else 42]))

That’s great–but if we introduce just a little indirection:

(: foo-let (Any-> Number))
(define (foo-let x)
  (let ([y x])
    (cond
      [(number? y) x]
      [else 42])))
;; does not typecheck in Racket version <= 6.1.1  =(

We discover that, even though y was bound to x, that information does not make it into the type-system. Instead, the only information that relates the two values is (more or less) the following three propositions:

  • y is of type Any

  • If y evaluates to #f, x is of type False.

  • If y evaluates to a non-#f value, x is not of type False.

Now, you may be thinking foo-let doesn’t typecheck because no one should ever write such a silly program! However, many of Racket’s useful macros expand into code which requires just this sort of type-based reasoning:

(: foo-match (Any -> Number))
(define (foo-match x)
  (match x
    [(? number?) x]
    [_ 42]))
;; does not typecheck in Racket version <= 6.1.1
;; because match expands into a program utilizing let-bindings

It sure would be nice if we could get some of these seemingly simple programs (and more!) to typecheck without a big refactoring of the type system...

A slight aside: What is an object in TR?

In the calculus which describes Typed Racket’s type system, an ’object’ is a syntactic representation of an expression. If an expression ‘has an object,’ then there is some pure syntactic representation for the value it will evaluate to. Currently, objects can represent variables or accesses into immutable values such as cons cells or structs. For example:

Racket Expression

 

TR Object

x

 

x

(random 100)

 

No Object

(caar p)

 

(car (car p))

((λ (x) x) y)

 

y

(string-length s)

 

No Object

((λ (x) y) 42)

 

y

Basically, objects enable the type system derive the same logical meaning from expressions like (number? ((λ () x))) or ((λ (a) (number? a)) x) as it does from (number? x).

A Simple Solution: Let-aliasing Objects

In desire to keep things simple and maintain compatibility with what Typed Racket already does so well, I decided to explore adding a simple aliasing extension to the current type system.

Let-aliasing overview

My goal was to implement the following changes:

  • Add a function θ to the type environment which maps identifiers to objects. By default, θ just maps identifiers to themselves.

  • θ is extended when a let-expression binds an expression with an object to a variable. This extension is only in effect while checking the body of that let-expression.

  • When typechecking any variable expression, x, pretend your considering the object o (where o = (θ x)) instead.

So, for example, when typechecking the body of the let-expression in foo-let, we extend θ with the mapping (y -> x) instead of adding the three propositions relating x and y we saw earlier:

(: foo-let (Any-> Number))
(define (foo-let x)
  (let ([y x])  ;; <- we extend θ, mapping 'y' to the object 'x',
    (cond              ;; making references to 'y' be viewed
      [(number? y) x]  ;; as references to 'x' here within
      [else 42])))     ;; the body of the let

Type Judgments

Another way to describe this change is to observe the changes to the type judgments affecting let-expressions and variables.

Here are the original typing judgments (in a PLT Redex-ish format) from ‘Logical Types for Untyped Languages’ by Tobin-Hochstadt and Felleisen [ACM link]

[(Proves Γ_1 (x_1 -: τ_1))
 ------------------------ "T-Var"
 (TypeOf Γ_1 x_1 τ_1 (x_1 -! False) (x_1 -: False) x_1)]
 
[(TypeOf Γ_1 e_0 τ_0 ψ_0+ ψ_0- o_0)
 (where Γ_2 (cons (And (x_0 -: τ_0)
                       (Or (And (x_0 -! False) ψ_0+)
                           (And (x_0 -: False) ψ_0-)))
                  Γ_1))
 (TypeOf Γ_2 e_1 τ_1 ψ_1+ ψ_1- o_1)
 ------------------------ "T-Let"
 (TypeOf Γ_1
         (let ([x_0 e_0]) e_1)
         τ_1  [o_0 / x_0]
         ψ_1+ [o_0 / x_0]
         ψ_1- [o_0 / x_0]
         o_1  [o_0 / x_0])]

Where (TypeOf Γ e τ ψ ψ o) is a 5-place relation with the following arguments:

  • Γ is the current type environment.

  • e is the expression being typechecked.

  • τ is the type of e.

  • The first ψ is what we learn if e evaluates to be a non-#f value.

  • The second ψ is what we learn if e evaluates to #f.

  • o is the object of e.

(Proves Γ ψ) is a 2-place relation which holds when the environment of propositions Γ can prove the proposition ψ.

(x_1 -: τ_1) and (x_1 -! τ_1) are propositions which mean x_1 is or is not of some type τ_1, respectively.

[o_0 / x_0] placed next to something means to substitute o_0 for x_0 within that something.

Here are the let-aliasing versions that replace those rules (note we add a place for θ next to Γ):

[(where o_x (lookup θ_1 x_1))
 (Proves Γ_1 (o_x -: τ_1))
 ------------------------ "T-Var-Alias"
 (TypeOf θ_1 Γ_1 x_1 τ_1 (o_x -! False) (o_x -: False) o_x)]
 
[(TypeOf θ_1 Γ_1 e_0 τ_0 ψ_0+ ψ_0- o_0)
 (where #f (equal? o_0 null))
 (where θ_2 (extend θ_1 x_0 o_0))
 (TypeOf θ_2 Γ_2 e_1 τ_1 ψ_1+ ψ_1- o_1)
 ------------------------ "T-Let-Alias"
 (TypeOf θ_1 Γ_1 (let ([x_0 e_0]) e_1) τ_1 ψ_1+ ψ_1- o_1)]
 
;; this one is the same as T-Let above but e_0 is required
;; to have a null object (i.e. not have an object)
[(TypeOf θ_1 Γ_1 e_0 τ_0 ψ_0+ ψ_0- null)
 (where Γ_2 (cons (And (x_0 -: τ_0)
                       (Or (And (x_0 -! False) ψ_0+)
                           (And (x_0 -: False) ψ_0-)))
                  Γ_1))
 (TypeOf θ_1 Γ_2 e_1 τ_1 ψ_1+ ψ_1- o_1)
 ------------------------ "T-Let-No-Alias"
 (TypeOf θ_1
         Γ_1
         (let ([x_0 e_0]) e_1)
         τ_1  [null / x_1]
         ψ_1+ [null / x_1]
         ψ_1- [null / x_1]
         o_1  [null / x_1])]

This simple approach allows the type system to seamlessly track let-aliasing by only slightly modifying the type system’s behavior for let-expressions and variables!

With this change, TR can now successfully typecheck a variety of new programs!

(: super-foo (-> Any Number))
(define super-foo
  (λ (x)
    (match x
      [(? number?) x]
      [`(_ . (_ . ,(? number?))) (cddr x)]
      [`(_ . (_ . ,(? pair? p)))
        (if (number? (caddr x))
            (car p)
            41)]
      [_ 42])))
;; typechecks with let-aliasing!

Additionally, because this approach prevents several propositions from being generated (and disjunctions no less!) some programs which took quite a while to typecheck can now be verified immediately!

Epilogue

So isn’t this just copy propagation?

It’s definitely similar — both reason about known equalities between expressions and have to be aware of mutation (our use of objects handles this) — however, we’re not modifying the source program as a separate pass. We’re keeping the same program and just trying to edify the typechecker (without adding additional passes) with the same kinds of insights it could have if it was typehecking a program that had been rewritten using copy propagation or a similar technique.

Are there any simple programs you’re still working to typecheck?

Sure! This simple guy doesn’t typecheck:

(: size ((U Number String) -> Number))
(define (size x)
  (match x
    [(? number?) x]
    [_ (string-length x)]))

Because it expands into something like this:

(define (size-expanded x)
  (let* ([x1 x]
         [f2 (λ () (string-length x))])
    (if (number? x1)
        x
        (f2))))

Let-aliasing almost gets us there, but the environment in which f2 is typechecked doesn’t know that we’re only going to call it if x1 is not a number =(

Perhaps this is an argument for why let-aliasing should be a little more like copy propagation...? We’ll have to do some more digging and find out!

So how much of the Typed Racket codebase had to change to support this?

Not that much! Here’s the Github pull request.

I had to improve TR’s ability to update type information inside of structured types a little (e.g. cons cells, structs) since we no longer had extra variables floating around when we did things like (let ([x (car p)]) ...).

Now when two propositions like (x -: (Pairof Number Any)) and (x -: (Pairof Any Number)) are joined, we structurally recur into the car and cdr of the type and get the resulting fact (x -: (Pairof Number Number)), which was essential to typechecking programs which use aliases instead of new variables.

These changes were fairly natural/minor, however, and probably should have been made at some point even without aliasing. Aliasing just brought the matter front and center.

Edits: A few typos

Update: What if I don’t want aliasing?!

This change, in addition to allowing aliasing, assumes it is the desired default behavior when typechecking. It was pointed out that with this change, there are a few programs which used to typecheck but no longer do:

(: adhoc-min (All (A) (-> A A A)))
(define (adhoc-min a b)
  (let ([a* a]  [b* b])
    (cond
      [(and (integer? a) (integer? b))
       (if (< a b)  a* b*)]
      [else a*])))

This program errors with the following message: Type Checker: type mismatch. Expected: A. Given: (U Integer Inexact-Real) in: a*.

This is because previously the absence of aliasing prevented occurrences of a* or b* from having their types updated by type tests on a or b. If you wanted a* or b* to be unnaffected by type tests on a or b, you can simply add explicit annotations at the let-binding site:

(: adhoc-min2 (All (A) (-> A A A)))
(define (adhoc-min2 a b)
  (let ([a* : A a]  [b* : A b])
    (cond
      [(and (integer? a) (integer? b))
       (if (< a b)  a* b*)]
      [else a*])))

or use local defines to create a* and b*:

(: adhoc-min3 (All (A) (-> A A A)))
(define (adhoc-min3 a b)
  (define a* a)
  (define b* b)
  (cond
    [(and (integer? a) (integer? b))
     (if (< a b)  a* b*)]
    [else a*]))