Defunctionalization

4 min readMar 6, 2023

Defunctionalization is a compile-time step that reduces higher-order functions by a single first-order apply function. We can use Closure Conversion, an algorithm for converting local and anonymous functions into Closures. It does convert free variables into environment-bound variables, like:

let a = 1;
let f = (b) => {
  a + b
};

f(2);

That transforms into the following code when compiling:

let a = 1;
let f = (env, b) => {
  env.a + b
};

f({a}, 2);

This is necessary, because some targets don’t support nested functions, and global functions are objectively faster than nested and local functions.

Defining the algorithm

The algorithm consists of substitution by the environment. The formula is:

The arguments are removed from the environment and intersect with the free variables set, resulting in removing all free variables that are not bound to the environment.

We take all the variables, and we transform the inner lambdas into a closure tuple, that is defined by the following pseudocode:

type Closure = {
  function: FunctionPtr,
  environment: Map<string, string>
}

The code will be represented as a lifted lambda highlighted in bright red.

The light blue represents the closure converted, with the name function, with the environment parameter being the star above
The intermediate blue represents the access within the environment

This is a pseudocode just to represent the first run of the algorithm, this will run at every closure in the environment, until it's finished, but here in the blog post it would be giant if expanded, so we will enter in details in the next step.

Writing the algorithm

First of all, we need to define the tree as a simple algebraic data type:

use im::{hashset, HashMap, HashSet};

#[derive(Debug, Clone, PartialEq, Eq)]
enum Lifted {
    Yes,
    No,
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum Term {
    Ref(String),
    Sym(String),
    Lam(Lifted, Vec<String>, Box<Term>),
    App(Box<Term>, Box<Term>),
    Closure(HashMap<String, Term>, Box<Term>),
}

This is the lambda calculus extended with closures, and we need to define a function that will check the free variables on a Term:

Free variables are variables that aren’t bound to any parameter

fn free_vars(term: &Term) -> HashSet<String> {
    match term {
        Term::Ref(_) => hashset![],
        // If it's a symbol, it's a free variable
        Term::Sym(name) => hashset![name.clone()],
        Term::App(head, tail) => free_vars(head).union(free_vars(tail)),
        Term::Lam(_, args, body) => {
            free_vars(body).relative_complement(args.iter().cloned().collect())
        }
        Term::Closure(env_values, body) => env_values
            .iter()
            .flat_map(|entry| free_vars(entry.1))
            .chain(free_vars(body))
            .collect(),
    }
}

And the following step is to define the closure convert function, that will hold the previously defined algorithm:

fn convert(mut env: HashSet<String>, subst: HashSet<String>, term: Term) -> Term {
    match term {
        // If the symbol is in the substitution, it will be converted to ref
        Term::Sym(name) if subst.contains(&name) => Term::Ref(name),
        // If the lambda is already lifted, we just need to run the conversion on 
        // the existing lambda calculus tree.
        Term::Lam(Lifted::Yes, args, body) => {
            // Since the env is cloned on every function call, we can mutate it
            // by adding the arguments, instead of cloning with the new values, and
            // it makes the code more legible.
            for arg in args.iter() {
                env.insert(arg.clone());
            }

            Term::Lam(
                Lifted::Yes,
                args,
                Box::new(convert(env.clone(), subst.clone(), *body)),
            )
        }
        // If the lambda isn't lifted, we need to convert it now
        Term::Lam(Lifted::No, mut args, body) => {
            for arg in args.iter() {
                env.insert(arg.clone());
            }

            let fv = free_vars(&body);
            let closure_refs = env
                .iter()
                .cloned()
                .collect::<HashSet<String>>()
                // The environment subtracts the arguments' symbols
                // {x, y} - {y} = {x} ->
                .relative_complement(args.iter().cloned().collect())
                // And mantain the symbols that exists on free variables
                // {x} ∩ {x} ->
                .intersection(fv.clone());

            // We use the {x}, that is bound to the environment, and substitute
            // in the first match case.
            let body = Box::new(convert(env.clone(), closure_refs.clone(), *body));

            let term = if closure_refs.is_empty() {
                Term::Lam(Lifted::Yes, args, body)
            } else {
                // Here, we build a env values hash map just to make compiling easier
                // so we don't need to do that directly on the target.
                let env_values = closure_refs
                    .iter()
                    .map(|name| (name.clone(), Term::Sym(name.clone())))
                    .collect::<HashMap<String, Term>>();
                
                // Every closure needs an `env` argument.
                args.push("#env".to_string());

                Term::Closure(env_values, Box::new(Term::Lam(Lifted::Yes, args, body)))
            };

            // We run again to convert to convert the children.
            convert(env, subst, term)
        }
        Term::App(head, tail) => Term::App(
            Box::new(convert(env.clone(), subst.clone(), *head)),
            Box::new(convert(env.clone(), subst.clone(), *tail)),
        ),
        _ => term,
    }
}

And, voilá, this runs ok!

let term = Term::Lam(
    Lifted::No,
    vec!["x".to_string()],
    Box::new(Term::Lam(
        Lifted::No,
        vec!["y".to_string()],
        Box::new(Term::App(
            Box::new(Term::Sym("x".to_string())),
            Box::new(Term::Sym("y".to_string())),
        )),
   )),
);

println!("{:?}", convert(hashset![], hashset![], term));
// Lam(Yes, ["x"], Closure({"x": Sym("x")}, Lam(Yes, ["y", "#env"], App(Ref("x"), Sym("y")))))

Credits

~ Chiyoku (https://github.com/algebraic-sofia)

Defunctionalization

Defining the algorithm

Writing the algorithm

Credits

Written by Gabi