Building a language interpreter: The Execution Environment.

Everything happens in a context.

Jan 29, 2024

Take the simple for loop shown below: we define a variable, i, mutate it, and print the value it references.

for (let i=0; i < 10; i=i+1){
    print i;
}

For this to execute, the variable must be stored somewhere it can be referenced for subsequent use. Take the following example too, in which we first define the variable, a, to reference a value of 1.

let a = 1;
for (let a = 10; a < 20; a=a+1){
    print a;
}

We then go into our loop and define a looping variable with the same name, a. Given that our language is lexically scoped, the interpreter must tell the two bindings apart so that updating the loop variable does not change the outer a.

All of this means that our interpreter should be able to:

Keep a mapping of identifiers to the values they refer to.
Maintain a hierarchy of such mappings, so that the same name can be bound in nested scopes.
Provide a means of updating the objects that those identifiers map to.

Modelling our environment

As usual, we will look to the simplest implementation that satisfies our requirements at this point.

Namespaces

The namespace is our basic unit for maintaining mappings of identifiers to values. We represent a namespace using a hashmap that maps string identifiers to SoxObjects. Our implementation of this is shown below.

#[derive(Clone, Debug)]
pub struct Namespace {
    pub bindings: HashMap<String, SoxObject>,
}

impl Default for Namespace {
    fn default() -> Self {
        debug!("Creating new namespace in current environment");
        let bindings = HashMap::new();
        Self { bindings }
    }
}

impl Namespace {
    pub fn define<T: Into<String>>(
        &mut self,
        name: T,
        value: SoxObject,
    ) -> Result<(), RuntimeException> {
        self.bindings.insert(name.into(), value);
        Ok(())
    }

    pub fn assign<T: Into<String>>(
        &mut self,
        name: T,
        value: SoxObject,
    ) -> Result<(), RuntimeException> {
        self.bindings.insert(name.into(), value);
        Ok(())
    }

    pub fn get<T: AsRef<str>>(&self, name: T) -> Result<SoxObject, RuntimeException> {
        // A lookup only reads the map, so a shared borrow of self is enough.
        let name = name.as_ref();
        match self.bindings.get(name) {
            Some(v) => Ok(v.clone()),
            None => Err(RuntimeException {
                msg: format!("Undefined variable {}.", name),
            }),
        }
    }
}

Since our namespace is basically a mapping, it supports the basic operations of a map: get, define and assign, which fetch, create and update the value bound to an identifier respectively. One small detail here is that we distinguish between defining a variable and assigning to a variable. At the namespace level both simply insert into the hashmap; the distinction is enforced by the environment in the next section, whose assign only updates a name that already exists and fails otherwise. This will prove useful later on.

Notice how we use the traits AsRef<str> and Into<String> in some of our method definitions. This makes our methods more flexible: for example, the bound Into<String> means that we can pass any type that implements that trait as an argument, so we can call the method with either a String or a &str.
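To make the effect of these trait bounds concrete, here is a minimal sketch. The helper functions store_name and name_len are hypothetical and not part of the Sox codebase; they exist only to show which argument types each bound accepts.

```rust
// Hypothetical helpers (not from the Sox codebase) that illustrate the bounds.

/// Takes ownership of the name, converting it into a `String` if needed.
fn store_name<T: Into<String>>(name: T) -> String {
    name.into()
}

/// Only needs to read the name, so borrowing it as a `&str` is enough.
fn name_len<T: AsRef<str>>(name: T) -> usize {
    name.as_ref().len()
}

fn main() {
    let owned = String::from("counter");

    // Both a string literal (`&str`) and an owned `String` satisfy `Into<String>`.
    assert_eq!(store_name("counter"), "counter");
    assert_eq!(store_name(owned.clone()), "counter");

    // `AsRef<str>` accepts `&str`, `&String`, and `String`.
    assert_eq!(name_len("counter"), 7);
    assert_eq!(name_len(&owned), 7);
    assert_eq!(name_len(owned), 7);
}
```

A reasonable rule of thumb is to ask for Into<String> when the method needs to keep the name (as define does when inserting a key) and AsRef<str> when it only needs to read it (as get does).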

Our environment - a hierarchy of namespaces

We model our environment as a stack of namespaces, which gives our mappings the hierarchy we need. The implementation of our environment is shown below.

#[derive(Clone, Debug)]
pub struct Env {
    namespaces: Vec<Namespace>,
}


impl Default for Env {
    fn default() -> Self {
        Self {
            namespaces: vec![Namespace::default()],
        }
    }
}

impl Env {
    pub fn define<T: Into<String>>(&mut self, name: T, value: SoxObject) {
        let _ = self.namespaces.last_mut().unwrap().define(name, value);
    }

    pub fn get<T: Into<String> + Display>(&mut self, name: T) -> Result<SoxObject, RuntimeException> {
        let name_literal = name.into();
        for namespace in self.namespaces.iter_mut().rev() {
            if let Ok(value) = namespace.get(name_literal.as_str()) {
                return Ok(value.clone());
            }
        }

        return Err(RuntimeException {
            msg: format!("Undefined variable {}.", name_literal),
        });
    }

    pub fn assign<T: Into<String> + Display>(&mut self, name: T, value: SoxObject) -> Result<(), RuntimeException> {
        let name_literal = name.into();
        for namespace in self.namespaces.iter_mut().rev() {
            if namespace.get(name_literal.as_str()).is_ok() {
                namespace.assign(name_literal, value)?;
                return Ok(());
            }
        }

        return Err(RuntimeException {
            msg: format!("Variable {} not defined in curr env.", name_literal),
        });
    }

    pub fn new_namespace(&mut self) -> Result<(), RuntimeException> {
        let namespace = Namespace::default();
        self.namespaces.push(namespace);

        Ok(())
    }

    pub fn pop(&mut self) -> Result<(), RuntimeException> {
        self.namespaces.pop();
        Ok(())
    }

    pub fn size(&self) -> usize {
        self.namespaces.len()
    }
}

As we will see later on when we get to our interpreter, each time we go into a block, we push a new namespace onto the environment stack and when we exit the block, we pop the namespace from the stack. This way, our interpreter can handle the snippet we saw above where we had multiple definitions of the same identifier in different blocks.
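The push and pop behaviour described here can be tried out in isolation. The sketch below is a simplified stand-in rather than the Sox implementation: the name ScopeStack is hypothetical, values are plain i64s instead of SoxObjects, and errors are Strings instead of RuntimeExceptions. It walks through the shadowing example from the start of this post by hand.

```rust
use std::collections::HashMap;

// Simplified stand-in for the environment: values are plain `i64`s and
// errors are `String`s (assumptions made to keep the sketch self-contained).
struct ScopeStack {
    scopes: Vec<HashMap<String, i64>>,
}

impl ScopeStack {
    fn new() -> Self {
        // Start with a single global scope.
        Self { scopes: vec![HashMap::new()] }
    }

    fn push(&mut self) {
        self.scopes.push(HashMap::new());
    }

    fn pop(&mut self) {
        self.scopes.pop();
    }

    // Define always targets the innermost scope.
    fn define(&mut self, name: &str, value: i64) {
        if let Some(scope) = self.scopes.last_mut() {
            scope.insert(name.to_string(), value);
        }
    }

    // Look the name up from the innermost scope outwards.
    fn get(&self, name: &str) -> Result<i64, String> {
        self.scopes
            .iter()
            .rev()
            .find_map(|scope| scope.get(name).copied())
            .ok_or_else(|| format!("Undefined variable {}.", name))
    }

    // Assign updates the nearest existing binding and fails if there is none.
    fn assign(&mut self, name: &str, value: i64) -> Result<(), String> {
        for scope in self.scopes.iter_mut().rev() {
            if let Some(slot) = scope.get_mut(name) {
                *slot = value;
                return Ok(());
            }
        }
        Err(format!("Variable {} not defined.", name))
    }
}

fn main() {
    let mut env = ScopeStack::new();
    env.define("a", 1); // let a = 1;

    env.push(); // entering the for loop
    env.define("a", 10); // let a = 10;
    env.assign("a", 11).unwrap(); // a = a + 1 updates the inner `a` only
    assert_eq!(env.get("a"), Ok(11));
    env.pop(); // leaving the loop

    assert_eq!(env.get("a"), Ok(1)); // the outer `a` is untouched
    assert!(env.assign("b", 5).is_err()); // assigning to an undefined name fails
}
```

Because lookups walk the stack from the top down, the inner a shadows the outer one only while the loop's namespace is on the stack.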

Updating objects in our environment

The get method of our environment struct, reproduced below, returns a clone of the object bound to a name in our environment.

    pub fn get<T: Into<String> + Display>(&mut self, name: T) -> Result<SoxObject, RuntimeException> {
        let name_literal = name.into();
        for namespace in self.namespaces.iter_mut().rev() {
            if let Ok(value) = namespace.get(name_literal.as_str()) {
                return Ok(value.clone());
            }
        }

        return Err(RuntimeException {
            msg: format!("Undefined variable {}.", name_literal),
        });
    }

Recall that objects are wrapped in Rc, so when we get a reference to an object in our environment, we call clone to increase the reference count on that object rather than copy the underlying data. This way, multiple references to the same object can exist. As we implement more complex object types, we will see the importance of this.
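The reference counting can be observed directly with the standard library. In the sketch below, Rc<String> stands in for a stored object and the helper share is hypothetical; the actual SoxObject type is not reproduced here.

```rust
use std::rc::Rc;

// Hypothetical helper: hands out another handle to the same object,
// much like a lookup that clones the stored value.
fn share(obj: &Rc<String>) -> Rc<String> {
    Rc::clone(obj)
}

fn main() {
    // Stand-in for an object stored in the environment.
    let stored = Rc::new(String::from("hello"));
    assert_eq!(Rc::strong_count(&stored), 1);

    // Cloning an `Rc` copies the pointer, not the string it points to.
    let handle = share(&stored);
    assert_eq!(Rc::strong_count(&stored), 2);
    assert!(Rc::ptr_eq(&stored, &handle));

    // Dropping the extra handle lowers the count again.
    drop(handle);
    assert_eq!(Rc::strong_count(&stored), 1);
}
```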

That is all there is to our environment at this point. See the tag, sox-environment, for our updated code.

So far, we have been writing about the various independent parts that make up our interpreter; next, we will look at how we put them together into a basic interpreter.

Building with Rust
