Building an interpreter: Classes ...or Types

Types, and then more types

Jul 19, 2024

We have added built-in types such as integers, strings and functions to our interpreter, but to make our language more interesting we must give users the ability to define their own types. Our language definition allows for this functionality and we implement it now.

The class statement is used to create new user types. These types can be arranged in a hierarchy to implement inheritance thus adding a new level of expressivity to our interpreter. The snippet below is an example of a class hierarchy that our interpreter should support.

class Test{

    init(){
        print "running init in test";
    }

    my_test(){
        return 1;
    }
}

class AnotherTest: Test {

    init(){
        super.init();
        print "running init";
        this.name = "this is a test attr";
    }

    my_test(){
        let i = super.my_test();
        print i;
        print this.name + " here";
        return 10;
    }
}

In the snippet above, the AnotherTest class inherits from the Test class and overrides the my_test and init methods. The overridden methods both invoke the procedures defined in the super-class when called. This is the simple kind of inheritance we want our user-defined types to support.

The support we have to add for these user-defined types is similar to that of our built-in types. We need to

Implement the type for our user-defined types - just like built-in types have types, user-defined types must also have types. This will hold all methods that our type implements.
Implement visitor functions for class and associated statements in our interpreter type.
Implement a type for instances of user class definitions - when we call our user defined type, we create instances which in our interpreter are just data containers. This type represents those instances.
Implement basic method inheritance for our user defined types.

Representing user defined types

First on our todo list - how do we represent user-defined types? To answer this we take a step back and ask - what functionality should our type representation offer? Like any other language that supports user-defined types, whatever representation we choose must be able to

Initialize a user defined type instance when called,
Hold method definitions for the user defined type so instances can call such methods,
Hold a reference to super class (if one exists) so that it can handle inheritance.

The Type type that we defined earlier for use with our built-in types provides all of the functionality listed above, so we will use it to represent our user-defined types. First, we need to make a small change to the Type definition: we introduce another method that is called to initialise the type when creating a user-defined type. The existing new method for creating static instances of Type is renamed to new_static_type, and another method named new, shown below, is implemented for user-defined types. For now, the only major difference between this method and new_static_type is that we initialise the name of the type in this method.

pub fn new<T: ToString>(
    name: T,
    base: Option<SoxRef<SoxType>>,
    methods: HashMap<String, SoxMethod>,
    slots: SoxTypeSlot,
    attributes: SoxAttributes,
) -> Self {
    Self {
        base,
        methods,
        slots,
        attributes,
        // Unlike new_static_type, the name of the type is set here.
        name: Some(name.to_string()),
    }
}

Another difference is how we store method definitions for user-defined types. To simplify our implementation, we store our type’s methods as attributes on Type, so we look in the attributes field when resolving methods. The base field holds an optional reference to the super-class for inheritance purposes.

Finally, the Type type needs a call implementation that yields instances when called via the call syntax, (). The call implementation below serves this purpose.

 pub fn call(fo: SoxObject, args: FuncArgs, interpreter: &mut Interpreter) -> SoxResult {
        if let Some(to) = fo.as_type() {
            let instance = SoxInstance::new(to.clone());
            let initializer = to.find_method("init".into());
            let instance = instance.into_ref();
            let ret_val = if let Some(init_func) = initializer {
                let func = init_func
                    .as_func()
                    .expect("init resolved to a non function object");
                let bound_method = func.bind(instance.clone(), interpreter)?;
                SoxFunction::call(bound_method, args, interpreter)?;
                Ok(instance)
            } else {
                Ok(instance)
            };
            ret_val
        } else {
            let error = Exception::Err(RuntimeError {
                msg: "first argument to this call method should be a type object".to_string(),
            });
            return Err(error.into_ref());
        }
    }

As part of the call procedure, it checks whether an init method has been defined for our type and, if so, calls that method to initialise the instance. Looking at the implementation, we see that before calling any init method that it finds, the method is first bound to the instance; we will see why this is necessary in subsequent sections.
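The flow of call can be sketched in isolation. What follows is a simplified model rather than the interpreter's code: TypeSketch, InstanceSketch, Initializer, instantiate and point_init are hypothetical names, values are plain i64s, and passing the instance to the initializer stands in for binding.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical stand-ins for SoxType and SoxInstance, reduced to the essentials.
type Initializer = fn(&InstanceSketch, &[i64]);

struct TypeSketch {
    name: String,
    init: Option<Initializer>,
}

struct InstanceSketch {
    typ: Rc<TypeSketch>,
    fields: RefCell<HashMap<String, i64>>,
}

// Same shape as `call`: build the instance first, then run `init`
// against it only if the type defines one.
fn instantiate(typ: &Rc<TypeSketch>, args: &[i64]) -> Rc<InstanceSketch> {
    let instance = Rc::new(InstanceSketch {
        typ: Rc::clone(typ),
        fields: RefCell::new(HashMap::new()),
    });
    if let Some(init) = typ.init {
        // Handing the instance to `init` plays the role of binding `this`.
        init(&*instance, args);
    }
    instance
}

// An example initializer that stores its first argument as the field `x`.
fn point_init(this: &InstanceSketch, args: &[i64]) {
    if let Some(x) = args.first() {
        this.fields.borrow_mut().insert("x".to_string(), *x);
    }
}
```

Calling instantiate on a type whose initializer is point_init, with the arguments [7], yields an instance whose x field holds 7. A type without an initializer yields an instance with no fields set.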

SoxInstance

The result of invoking a user-defined type’s call method is an instance of that type, but what does an instance really mean and how do we represent it? Once again, thinking about the functionality that such a data structure should support gives us an idea of how we should represent it. An instance must:

1. Hold data unique to that instance,
2. Be able to invoke its type’s methods, which can manipulate the data of just the instance invoking them,
3. Be able to get and set attributes either from its own data or from its type.

#1 means that instances must have some kind of data structure to hold data unique to them, and #2 means that instances must somehow hold a reference to the type that created them; this way they can query the type for methods when needed.

We implement a SoxInstance as a struct with two fields:

typ - this holds a reference to the type that created the instance
fields - this is a RefCell wrapping a hashmap that holds the instance’s data. We use a RefCell here so that the map can be updated through shared references.

#[derive(Clone, Debug)]
pub struct SoxInstance {
    typ: SoxRef<SoxType>,
    fields: RefCell<HashMap<String, SoxObject>>,
}

impl SoxInstance {
    pub fn new(class: SoxRef<SoxType>) -> Self {
        let fields = HashMap::new();
        Self {
            typ: class,
            fields: RefCell::new(fields),
        }
    }

    pub fn set(&self, name: Token, value: SoxObject) {
        self.fields.borrow_mut().insert(name.lexeme.into(), value);
    }

    pub fn get(inst: SoxRef<SoxInstance>, name: Token, interp: &mut Interpreter) -> SoxResult {
        if let Some(field_value) = inst.fields.borrow().get(name.lexeme.as_str()) {
            return Ok(field_value.clone());
        }

        if let Some(method) = inst.typ.find_method(name.lexeme.as_str()) {
            if let Some(func) = method.as_func() {
                let bound_method = func.bind(SoxObject::TypeInstance(inst.clone()), interp);
                return bound_method;
            } else {
                return Err(Interpreter::runtime_error(format!(
                    "Found property with same name, {}, but it is not a function",
                    name.lexeme
                )));
            }
        }

        Err(Interpreter::runtime_error(format!(
            "Undefined property - {}",
            name.lexeme
        )))
    }
}

To get and set attributes (attributes include methods here), our implementation of SoxInstance defines get and set methods. Both are straightforward. When we invoke the get method to get an attribute, the instance:

Returns the attribute if the referenced name is in the field map,
If the attribute is not in the field map, searches for a method with the same name in its type and returns a bound instance of that method if it finds one. We will come back to this search in a bit.
If no attribute or method is found, returns an exception.

The set implementation is a simple method that takes a field name and a value, then sets the value on the given instance.
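The lookup order of get, and the reason set can take &self, are easier to see in a stripped-down model. This is a sketch under assumptions: Value, TypeSketch and InstanceSketch are hypothetical stand-ins, Rc takes the place of SoxRef, and a method is represented only by its name instead of a bound function.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical, simplified value type; the interpreter uses SoxObject.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Int(i64),
    Method(String), // stands in for a bound method, identified by name
}

struct TypeSketch {
    methods: HashMap<String, Value>,
}

struct InstanceSketch {
    typ: Rc<TypeSketch>,
    fields: RefCell<HashMap<String, Value>>,
}

impl InstanceSketch {
    fn new(typ: Rc<TypeSketch>) -> Self {
        Self { typ, fields: RefCell::new(HashMap::new()) }
    }

    // `&self` is enough: RefCell moves the borrow check to run time,
    // so any holder of a shared reference can update the map.
    fn set(&self, name: &str, value: Value) {
        self.fields.borrow_mut().insert(name.to_string(), value);
    }

    // Fields are consulted first, so a field shadows a method of the same name.
    fn get(&self, name: &str) -> Result<Value, String> {
        if let Some(value) = self.fields.borrow().get(name) {
            return Ok(value.clone());
        }
        if let Some(method) = self.typ.methods.get(name) {
            return Ok(method.clone());
        }
        Err(format!("Undefined property - {}", name))
    }
}
```

In this model, a value set through one Rc handle is visible through every other handle to the same instance, and setting a field named my_test hides the method of that name on subsequent lookups.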

Binding methods to Instances and This

Once we find a method referenced by a name, we still have to bind this to the instance from which we call that method so that it can be used to manipulate the data of just that instance. For example, when we initialize our instances via the init method, we may want to set attribute values on that instance. To do so, we use the this keyword and set such attributes on whatever object the keyword references. Here, it references our instance.

To achieve this, we first introduce the notion of a bound method, which is a function that has been associated with a class instance. Before we use a method that is referenced from an instance, we must bind the function to that instance and then call the bound method. Binding a function to an instance is all about creating a new environment in which we assign the this keyword to the instance we are calling from. We then create a new function using this new environment and return it so that it can be called. Within the method, we can now reference the instance from which it has been called. To support this, we modify our function type to include a new bind method implemented as shown below.

pub fn bind(&self, instance: SoxObject, interp: &mut Interpreter) -> SoxResult {
        if let SoxObject::TypeInstance(_) = instance {
            let environment = interp.referenced_env(self.environment_ref); 
            let mut new_env = environment.clone();
            let namespace = Namespace::default();
            new_env
                .push(namespace)
                .expect("Failed to push namespace into env.");
            new_env.define("this", instance);

            let env_ref = interp.envs.insert(new_env);
            let new_func = SoxFunction {
                declaration: self.declaration.clone(),
                environment_ref: env_ref,
                is_initializer: false,
            };
            return Ok(new_func.into_ref()); 
        } else {
            Err(Interpreter::runtime_error(
                "Could not bind method to instance".to_string(),
            ))
        }
    }
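To see what binding buys us, here is a minimal model of the same idea. It rests on several assumptions: EnvSketch and FunctionSketch are hypothetical types, values are plain strings, and the function owns a copy of its environment instead of holding a reference into the interpreter's environment store.

```rust
use std::collections::HashMap;

// Hypothetical environment: a stack of namespaces, innermost last.
#[derive(Clone, Default)]
struct EnvSketch {
    namespaces: Vec<HashMap<String, String>>,
}

impl EnvSketch {
    // Defines a name in the innermost namespace, creating one if needed.
    fn define(&mut self, name: &str, value: &str) {
        if self.namespaces.is_empty() {
            self.namespaces.push(HashMap::new());
        }
        if let Some(top) = self.namespaces.last_mut() {
            top.insert(name.to_string(), value.to_string());
        }
    }

    // Looks a name up from the innermost namespace outwards.
    fn get(&self, name: &str) -> Option<&String> {
        self.namespaces.iter().rev().find_map(|ns| ns.get(name))
    }
}

#[derive(Clone)]
struct FunctionSketch {
    body: String,   // stands in for the declaration
    env: EnvSketch, // environment captured by the function
}

impl FunctionSketch {
    // Returns a copy of the function whose environment has one extra
    // namespace in which `this` names the given instance.
    fn bind(&self, instance: &str) -> FunctionSketch {
        let mut env = self.env.clone();
        env.namespaces.push(HashMap::new());
        env.define("this", instance);
        FunctionSketch { body: self.body.clone(), env }
    }
}
```

Binding the same method to two instances gives two functions that share a body but resolve this differently, while the unbound method still has no this in scope.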

Inheritance

We are almost done with implementing a basic user-defined type system. The outstanding part is the nuts and bolts of our inheritance, in this case, a method inheritance system.

We want an inheritance system in which, when an instance cannot find a method in its type, it queries the supertype, and so on up the hierarchy until it either finds such a method or throws an exception. As previously discussed, methods for a given type are stored in a hashmap on the type, so getting a method is just a case of searching that map. To support inheritance, we make that search recursive, starting with the current type and going up the type hierarchy as in the find_method implementation below. That is all there is to our inheritance for now.

 pub fn find_method(&self, name: &str) -> Option<SoxObject> {
        self.attributes
            .get(name)
            .cloned()
            .or_else(|| self.base.as_ref().and_then(|base| base.find_method(name)))
    }
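The effect of this recursive search on the Test and AnotherTest classes from the start of the post can be checked with a small stand-alone model. The TypeSketch type and its new constructor are hypothetical, Rc stands in for SoxRef, and each method is represented by a label instead of a function object.

```rust
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical stand-in for SoxType.
struct TypeSketch {
    name: String,
    base: Option<Rc<TypeSketch>>,
    methods: HashMap<String, String>, // method name -> label for its body
}

impl TypeSketch {
    fn new(name: &str, base: Option<Rc<TypeSketch>>, methods: &[(&str, &str)]) -> Rc<Self> {
        let methods = methods
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect();
        Rc::new(Self { name: name.to_string(), base, methods })
    }

    // Look in this type first; fall back to the base type, recursively.
    fn find_method(&self, name: &str) -> Option<String> {
        self.methods
            .get(name)
            .cloned()
            .or_else(|| self.base.as_ref().and_then(|base| base.find_method(name)))
    }
}
```

With Test defining init and my_test, and AnotherTest overriding only my_test, a lookup of my_test on AnotherTest stops at the override, a lookup of init falls through to Test, and a lookup of an unknown name returns None once the chain is exhausted.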

This aspect of inheritance is pretty straightforward, but what if we want to explicitly call an implementation from a supertype? In that case, we invoke the method via the super keyword. Within our class context, this keyword should reference the supertype if one exists, so when creating our type object in visit_class_stmt, we bind the super keyword to the superclass if one is provided.

Visiting the nodes

To round up, we have to implement the functions that handle visiting the nodes we have discussed above.

First, we implement the visit_class_stmt below to handle visiting class statements and creating the user-defined type objects.

fn visit_class_stmt(&mut self, stmt: &Stmt) -> Self::T {
        let ret_val = if let Stmt::Class {
            name,
            superclass,
            methods,
        } = stmt
        {
            let mut base = None;

            // get super class if exists
            let sc = if superclass.is_some() {
                let c = superclass.as_ref().unwrap();
                let sc = self.evaluate(c);
                if let Ok(SoxObject::Type(v)) = sc {
                    info!("Evaluated to a class");
                    base = Some(v);
                } else {
                    let re = Interpreter::runtime_error("Superclass must be a class.".to_string());
                    return Err(re);
                }
            };
            let none_val = { self.none.clone().into_ref() };
            let active_env = self.active_env_mut();
            active_env.define(name.lexeme.to_string(), none_val);

            let prev_env = self.active_env_ref.clone();
            // setup super keyword within namespace
            if superclass.is_some() {
                let env_ref = {
                    let active_env = self.active_env();
                    let mut env_copy = active_env.clone();
                    let namespace = Namespace::default();
                    env_copy.push(namespace)?;
                    let env_ref = self.envs.insert(env_copy);

                    env_ref
                };
                self.active_env_ref = env_ref;

                let sc = self.evaluate(superclass.as_ref().unwrap());
                if let Ok(SoxObject::Type(v)) = sc {
                    let env = self.referenced_env(env_ref);

                    env.define("super", SoxObject::Type(v.clone()))
                }
            }

            let mut methods_map = HashMap::new();
            //setup methods
            for method in methods.iter() {
                if let Stmt::Function { name, body, params } = method {
                    let func = SoxFunction {
                        declaration: Box::new(method.clone()),
                        environment_ref: self.active_env_ref.clone(),
                        is_initializer: name.lexeme == "init".to_string(),
                    };
                    methods_map.insert(name.lexeme.clone().into(), func.into_ref());
                }
            }

            // set up class in environment
            let class_name = name.lexeme.to_string();
            let class = SoxType::new(
                class_name.clone(),
                base,
                Default::default(),
                Default::default(),
                methods_map,
            );
            self.active_env_ref = prev_env;
            let active_env = self.active_env_mut();
            active_env.assign(class_name, class.into_ref())?;

            Ok(())
        } else {
            let err =
                Interpreter::runtime_error("Calling a visit_class_stmt on non class type.".into());
            return Err(err);
        };
        ret_val
    }

The visit_class_stmt function does the following:

1. Evaluates the super type, if one is defined,
2. Where a super type is defined, creates a new namespace for our class to hold the super keyword,
3. Creates a function object for each function defined in our class statement,
4. Creates the class object with data from #1-#3,
5. Creates an entry for this class object in our currently active env.
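The same sequence of steps can be sketched without the interpreter machinery. This is a rough model, not a rewrite of visit_class_stmt: ClassSketch, MethodSketch, EnvSketch and declare_class are hypothetical, the environment is a single map, and a sees_super flag takes the place of the extra namespace that holds super.

```rust
use std::collections::HashMap;
use std::rc::Rc;

struct MethodSketch {
    is_initializer: bool,
    sees_super: bool, // true when the method's environment would hold `super`
}

struct ClassSketch {
    name: String,
    base: Option<Rc<ClassSketch>>,
    methods: HashMap<String, MethodSketch>,
}

// Hypothetical environment: a name maps to None while its class is being built.
type EnvSketch = HashMap<String, Option<Rc<ClassSketch>>>;

fn declare_class(
    env: &mut EnvSketch,
    name: &str,
    superclass: Option<&str>,
    method_names: &[&str],
) -> Result<(), String> {
    // 1. Evaluate the super type, if any; it must resolve to a class.
    let base = match superclass {
        Some(sc) => match env.get(sc) {
            Some(Some(class)) => Some(Rc::clone(class)),
            _ => return Err("Superclass must be a class.".to_string()),
        },
        None => None,
    };
    // Define the name with a placeholder first, as visit_class_stmt does.
    env.insert(name.to_string(), None);
    // 2. and 3. Create one function object per method; only classes with a
    // super type get an environment in which `super` is visible.
    let methods: HashMap<String, MethodSketch> = method_names
        .iter()
        .map(|m| {
            let method = MethodSketch {
                is_initializer: *m == "init",
                sees_super: base.is_some(),
            };
            (m.to_string(), method)
        })
        .collect();
    // 4. Create the class object from the pieces above.
    let class = Rc::new(ClassSketch { name: name.to_string(), base, methods });
    // 5. Replace the placeholder with the finished class.
    env.insert(name.to_string(), Some(class));
    Ok(())
}
```

Declaring Test and then AnotherTest with Test as its super type leaves both names in the environment, with AnotherTest pointing at Test through base. Naming a super type that has not been declared returns the "Superclass must be a class." error.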

Next up, we implement the visit_get_expr and visit_set_expr functions for accessing attributes, as shown below.

 fn visit_get_expr(&mut self, expr: &Expr) -> Self::T {
        let ret_val = if let Expr::Get { name, object } = expr {
            let object = self.evaluate(object)?;
            if let SoxObject::TypeInstance(inst) = object {
                info!("Instance of type {:?}", inst.class(self));

                SoxInstance::get(inst, name.clone(), self)
            } else {
                Err(Interpreter::runtime_error(
                    "Only class instances have attributes".into(),
                ))
            }
        } else {
            Err(Interpreter::runtime_error(
                "Calling visit_get_expr on none get expr".into(),
            ))
        };
        ret_val
    }

    fn visit_set_expr(&mut self, expr: &Expr) -> Self::T {
        let ret_val = if let Expr::Set {
            name,
            object,
            value,
        } = expr
        {
            let object = self.evaluate(object)?;
            if let Some(mut v) = object.as_class_instance() {
                let value = self.evaluate(value)?;

                v.set(name.clone(), value.clone());
                Ok(value)
            } else {
                Err(Interpreter::runtime_error(
                    "Only instances have fields".into(),
                ))
            }
        } else {
            Err(Interpreter::runtime_error(
                "Calling visit_set_expr on none set expr".into(),
            ))
        };
        ret_val
    }

This is followed by the visitor function, visit_this_expr, which handles the this expression.

fn visit_this_expr(&mut self, expr: &Expr) -> Self::T {
        if let Expr::This { keyword } = expr {
            let value = self.lookup_variable(keyword, expr);
            value
        } else {
            Err(Interpreter::runtime_error(
                "Calling visit_this_expr on none this expr".into(),
            ))
        }
    }

Last is the visit_super_expr that handles the super expression as shown below.

 fn visit_super_expr(&mut self, expr: &Expr) -> Self::T {
        if let Expr::Super { keyword, method } = expr {
            let env = self.active_env_mut();
            let super_type = env.get("super")?;
            let instance = env.get("this")?;

            let method = if let SoxObject::Type(v) = super_type {
                let c = v;
                let method_name = method.lexeme.clone();
                let method = c.find_method(method_name.as_str());
                let t = if let Some(m) = method {
                    if let Some(func) = m.as_func() {
                        let bound_method = func.bind(instance, self)?;
                        Ok(bound_method)
                    } else {
                        Err(Interpreter::runtime_error(format!(
                            "Undefined property {}",
                            method_name
                        )))
                    }
                } else {
                    Err(Interpreter::runtime_error(format!(
                        "Undefined property {}",
                        method_name
                    )))
                };
                t
            } else {
                Err(Interpreter::runtime_error(
                    "Unable to resolve instance - this".into(),
                ))
            };
            method
        } else {
            Err(Interpreter::runtime_error(
                "Calling visit_super_expr on none super expr".into(),
            ))
        }
    }
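The key property of a super call is that the method search starts at the supertype while this still refers to the current instance. The sketch below illustrates that property under simplifying assumptions: TypeSketch, Method and call_super are hypothetical, the supertype is read from base instead of from a super entry in the environment, and this is just a label passed to the method.

```rust
use std::collections::HashMap;
use std::rc::Rc;

// A method receives `this`, modelled here as an instance label.
type Method = fn(&str) -> String;

struct TypeSketch {
    base: Option<Rc<TypeSketch>>,
    methods: HashMap<String, Method>,
}

impl TypeSketch {
    fn find_method(&self, name: &str) -> Option<Method> {
        self.methods
            .get(name)
            .copied()
            .or_else(|| self.base.as_ref().and_then(|b| b.find_method(name)))
    }
}

// `super.name()` as seen from a method defined on `current`: the search
// skips `current` and starts at its base, but `this` is passed unchanged.
fn call_super(current: &TypeSketch, this: &str, name: &str) -> Result<String, String> {
    let super_type = current
        .base
        .as_ref()
        .ok_or_else(|| "no super type in scope".to_string())?;
    let method = super_type
        .find_method(name)
        .ok_or_else(|| format!("Undefined property {}", name))?;
    Ok(method(this))
}

fn test_my_test(this: &str) -> String {
    format!("Test.my_test on {}", this)
}

fn another_my_test(this: &str) -> String {
    format!("AnotherTest.my_test on {}", this)
}
```

For an AnotherTest type that overrides my_test, a normal lookup finds the override, whereas call_super reaches the Test implementation and runs it against the same instance label.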

This wraps up a basic implementation of our user-defined types - we can now define types that inherit from other user-defined types and invoke methods defined in our user-defined types. This point marks the completion of a basic implementation of the Sox language that we specified at the start.

As usual, you can find the source code up to this point on GitHub.

Building with Rust
