Intermezzo: A simple optimization

Compute it just once.

Aug 29, 2024

One possible area of improvement for our interpreter is the process of resolving variables. This resolution algorithm searches the environment for variables used in our script. Recall that the environment is just a hierarchy of namespaces containing name-to-object bindings. The interpreter resolves a variable by searching each namespace in the hierarchy till it finds the variable and raises an error otherwise. The interpreter does this each time it reads or writes a variable; this repeated process is highly inefficient. If we manipulate a variable twenty times within a block, we have to walk the namespace hierarchy the same number of times repetitively. The example snippet below is a concrete example of this issue.

def test() {
    let i = 0;
    while (i < 20) {
        let j = 0;
        while (j < 20) {
            let k = 0;
            while (k < 20) {
                if (i < 50) {
                    let t = i+j+k;
                    print k;
                }
                k = k + 1;
            }
            j = j+1;
        }
        i = i+1;
    }
}

Each time i, j or k is read, the interpreter has to find the namespace they are defined in by looping through each namespace in our environment - repetitive work that ideally should be done just once.

One way to avoid these repetitive steps is to introduce a step that caches the location of each variable’s namespace so that it can be accessed in constant time during the execution of the AST. To achieve this, we introduce a resolver step into our interpreter pipeline. This comes after the parsing step but just before interpretation. The Resolver type handles this resolution step for us. It implements the same visitor pattern just like our interpreter and parser.

The Resolver type

The resolver maps the occurrence of a variable on a given line to the index of the namespace in the environment where it is bound. For example, in the code snippet from above, the resolver resolves the variable, i, to the namespace at index 1 (the namespace with an index of 0 is the global namespace). To keep things simple, the resolver only works for local variables bound in functions or classes. The implementation of the resolver class is available here. The resolver walks the code AST and each time it encounters a variable, or the this or super keywords, it stores the index of the namespace it was encountered and the line number so that this information can be used during the interpretation stage. The Resolver type is shown below.

pub struct Resolver {
    scopes: Vec<Vec<(String, bool)>>,
    current_function: FunctionType,
    current_class: ClassType,
    resolved_data: HashMap<(String, usize), (usize, usize)>,
}

The resolver works like this -

The entry point to the resolver is the resolve method defined below that visits all statements in the AST in order.

pub fn resolve(
        &mut self,
        statements: &Vec<Stmt>,
    ) -> Result<HashMap<(String,usize), (usize, usize)>, ResolverError> {
        for stmt in statements {
            self.resolve_stmt(stmt.clone())?;
        }
        let res = self.resolved_data.clone();
        Ok(res)
    }

The resolver recursively walks AST using the visit_* pattern.
When the resolver encounters a block, it initializes a scope, a table of variable names to boolean values. This is analogous to how we create a namespace when we encounter a block during the execution of our AST and a scope has one-to-one mapping to the namespace in our interpreter.
When the resolver encounters a variable declaration, it inserts the variable into the scope with a false value using the declare method. This is immediately followed by a call to the define method that sets the value to true. These two calls help ensure that a variable does not reference itself.
When the resolver encounters a variable, or the this or super keywords assignments, it does a local resolution as defined below.
```
pub fn resolve_local(&mut self, expr: Expr, name: Token) -> Result<(), ResolverError> {
        for (dist_index, scope) in self.scopes.iter_mut().enumerate().rev() {
            for idx in 0..scope.len() {
                let val = scope.get_mut(idx);
                if val.unwrap().0 == name.lexeme.as_str() {
                    self.resolved_data
                        .insert((name.lexeme.to_string(), name.line), (dist_index, idx));
                }
            }
        }
        Ok(())
    }
```
This is the interesting part of the resolver and the main part. The method is responsible for computing the location of the variable in the list of scopes - this is equivalent to computing the location of the namespace where a variable exists. In addition to computing which namespace a variable exists in, it also computes an array index that tells us the variable’s index in the namespace if the namespace is backed by an array rather than a map, we will come to this later on.
At the end of this process, the resolver returns a copy of the resolved_data field to the caller.

Interpreter changes

The entry point to the interpreter is changed to accommodate the new resolver step in the process. This comes immediately after the parsing as shown below.

pub fn run(source: String, enable_var_resolution: bool) {
    let tokens = Lexer::lex(source.as_str());
    let mut parser = Parser::new(tokens);
    let mut var_resolver = Resolver::new();

    let ast = parser.parse();

    let mut interpreter = Interpreter::new();

    if ast.is_ok() {
        if enable_var_resolution {
            let resolved_data = var_resolver.resolve(&ast.as_ref().unwrap());

            interpreter.locals = resolved_data.unwrap();
        }
        interpreter.interpret(&ast.unwrap())
    }
}

The enable_var_resolution flag enables us to toggle between performing variable resolution or not.

We also modify the Interpreter type to include a field, local of type HashMap<(String, usize), (usize, usize)> that will be used in the process of name resolution. The locals variable holds data returned by the resolver. Finally, we also modify the two main functions that handle the variable lookup or assignment -

lookup_variable
visit_assign_expr

These will now check the locals map for information on a variable and then query the environment using that additional information. The new implementation of lookup_variable below is illustrative.

fn lookup_variable(&mut self, name: &Token, _expr: &Expr) -> SoxResult {
        if let Some(locals) = &self.locals {
            if let Some(dist) = locals.get(&(name.lexeme.to_string(), name.line)) {
                let (dst, idx) = dist;
                let active_env = self.envs.get_mut(self.active_env_ref).unwrap();
                let key = EnvKey::NameIdxPair((name.lexeme.to_string(), *dst, *idx));
                active_env.get(&key)
            } else {
                let global_env = self.global_env_mut();
                let key = EnvKey::Name(name.lexeme.to_string());
                let val = global_env.get(&key);
                val
            }
        } else {
            let active_env = self.active_env_mut();
            let key = EnvKey::Name(name.lexeme.to_string());
            let val = active_env.get(&key);
            val
        }
    }

Environment and Namespace changes

The Env and Namespace types are updated to make use of the new information that comes from the name resolution step. The get and assign methods of both types are modified to make use of the indexes that have been calculated during the name resolution process. During the name resolution process, the index of a variable within the block where it is defined has also been computed so we modify our Namespace to make use of this new information too by changing the bindings field to a vector type as below.

#[derive(Clone, Debug)]
pub struct Namespace {
    pub bindings: Vec<(String, SoxObject)>,
}

Now once the interpreter fetches the Namespace via its index, it uses the additional index information to pull out the correct binding from the bindings table.

Benchmarking the name resolution optimisation

To assess the impact of our change, we set up a small benchmark using the criterion,a third party rust library for benchmarking.

Our benchmark compare the time it takes to execute the snippet below with and without variable resolution.

def test() {
    let i = 0;
    while (i < 20) {
        let j = 0;
        while (j < 20) {
            let k = 0;
            while (k < 20) {
                if (i < 50) {
                    let t = i+j+k;
                    print k;
                }
                k = k + 1;
            }
            j = j+1;
        }
        i = i+1;
    }
}

The implementation of this benchmark is available on the optimisation branch of the repository. criterion outputs the nice graphs below to show how the benchmark results.

The benchmark results visualized in the graphs above show that using our name resolver provides a clear improvement over the alternative.

Building with Rust

Discussion about this post