Enhancing our object system with types

Any sufficiently advanced piece of technology is indistinguishable from magic...

May 25, 2024

Functions would have been the next topic of discussion, but completing the visit_call_expr function which handles object calls exposes the unergonomic nature of our object system; our code must explicitly check that an object is a function before proceeding. This means that whenever we have another object that supports calling, we must change the code to include that type. Although functional, it is less than ideal. I wanted something more elegant, and CPython's type system provided good inspiration. The way to handle this is for an object to have a type that defines what a call means for objects of that type. To call an object, we query its type for a call implementation, and where there is none, we know such an object does not support calling. In this solution, the methods in an object’s type determine its capabilities.
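To make the idea concrete before diving into the real types, here is a minimal sketch of dispatching a call through a type. The names Type, Object, CallSlot and call_object are hypothetical stand-ins, not the actual Sox definitions, and arguments are plain integers to keep the example short.

```rust
// Hypothetical, simplified stand-ins for the Sox types; not the real definitions.
type CallSlot = fn(&Object, Vec<i64>) -> Result<i64, String>;

struct Type {
    name: &'static str,
    // None means "objects of this type are not callable".
    call: Option<CallSlot>,
}

struct Object {
    typ: &'static Type,
    value: i64,
}

// The call implementation for the hypothetical "adder" type.
fn add_to_value(obj: &Object, args: Vec<i64>) -> Result<i64, String> {
    Ok(obj.value + args.iter().sum::<i64>())
}

static ADDER_TYPE: Type = Type { name: "adder", call: Some(add_to_value) };
static INT_TYPE: Type = Type { name: "int", call: None };

// The interpreter never asks "is this a function?". It asks the object's
// type for a call implementation and reports an error if there is none.
fn call_object(obj: &Object, args: Vec<i64>) -> Result<i64, String> {
    match obj.typ.call {
        Some(call) => call(obj, args),
        None => Err(format!("'{}' object is not callable", obj.typ.name)),
    }
}

fn main() {
    let adder = Object { typ: &ADDER_TYPE, value: 10 };
    let int = Object { typ: &INT_TYPE, value: 3 };
    println!("{:?}", call_object(&adder, vec![1, 2]));
    println!("{:?}", call_object(&int, vec![]));
}
```

Adding a new callable type in this model means filling in its call slot; call_object itself does not change.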

First, we make the following enhancements to our SoxObject to make it more ergonomic:

We switch from using the Rc<T> reference type to the SoxRef<T> generic type to represent references to our Sox types. A benefit of this switch is that it eliminates a lot of code repetition.
We implement the Deref and Clone traits for our SoxObject to make using our Sox types with generics easier.
We modify our SoxObject to include methods for getting Sox types from a SoxObject rather than using the payload! macro. For example, to get the integer value payload of a SoxObject, we use the as_int method shown below.

 pub fn as_int(&self) -> Option<SoxRef<SoxInt>> {
     match self {
         SoxObject::Int(v) => Some(v.clone()),
         _ => None,
     }
 }
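As a rough illustration of how such an accessor reads at a call site, the sketch below uses Rc as a stand-in for SoxRef and a cut-down SoxObject with only two variants. The Nil variant and the double helper are hypothetical and exist only for this example.

```rust
use std::rc::Rc;

// Simplified stand-ins: Rc plays the role of SoxRef, and the real SoxInt
// and SoxObject have more to them than this.
struct SoxInt {
    value: i64,
}

enum SoxObject {
    Int(Rc<SoxInt>),
    Nil,
}

impl SoxObject {
    fn as_int(&self) -> Option<Rc<SoxInt>> {
        match self {
            SoxObject::Int(v) => Some(v.clone()),
            _ => None,
        }
    }
}

// Hypothetical caller: `?` returns None early when the object is not an integer.
fn double(obj: &SoxObject) -> Option<i64> {
    let int = obj.as_int()?;
    Some(int.value * 2)
}

fn main() {
    let obj = SoxObject::Int(Rc::new(SoxInt { value: 21 }));
    println!("{:?}", double(&obj));
    println!("{:?}", double(&SoxObject::Nil));
}
```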

Next, we implement a type, SoxType, that holds type information about our objects.

SoxType

The SoxType is a simple container of type information that makes it easier to use our SoxObjects. For example, we can obtain almost all information about an object’s capabilities by querying the type for its methods. The SoxType struct is defined below.


pub type SoxNativeFunction = dyn Fn(&Interpreter, FuncArgs) -> SoxResult;

#[derive(Clone)]
pub struct SoxMethod {
    pub func: &'static SoxNativeFunction,
}

pub type SoxAttributes = HashMap<String, SoxObject>;
pub type GenericMethod = fn(SoxObject, FuncArgs, &mut Interpreter) -> SoxResult;

#[derive(Debug)]
pub struct SoxTypeSlot {
    pub call: Option<GenericMethod>,
}

#[derive(Debug)]
pub struct SoxType {
    pub base: Option<SoxTypeRef>,
    pub methods: HashMap<String, SoxMethod>,
    pub slots: SoxTypeSlot,
}

SoxType is a simple struct with the following fields:

a. base - The Sox interpreter supports single inheritance, and this field references the parent type. Inheritance functionality will be implemented later on.

b. methods - This field maps the type's method names to method objects.

c. slots - This concept is borrowed from the CPython interpreter. Slots hold methods that implement functionality that is not necessarily an attribute of a type. For example, the implementation of the call functionality goes into a slot because it defines what happens when an object is called using our call syntax.

Bootstrapping a SoxType

A SoxType is a built-in type that users never directly initialize, providing us with a lot of flexibility in how we create them. We choose to statically create every SoxType when the interpreter starts up. This makes lifetime management easier because all the types have a static lifetime. The interpreter struct has been modified to include a field, types, which is a library of all built-in types. Initializing this field during interpreter initialization creates all the built-in types as shown below.


#[derive(Debug)]
pub struct TypeLibrary {
    pub bool_type: &'static SoxType,
    pub float_type: &'static SoxType,
    pub int_type: &'static SoxType,
    pub str_type: &'static SoxType,
    pub none_type: &'static SoxType,
    pub exception_type: &'static SoxType,
    pub func_type: &'static SoxType,
}

impl TypeLibrary {
    pub fn init() -> Self {
        Self {
            bool_type: bool_::SoxBool::init_builtin_type(),
            float_type: float::SoxFloat::init_builtin_type(),
            int_type: int::SoxInt::init_builtin_type(),
            str_type: string::SoxString::init_builtin_type(),
            none_type: none::SoxNone::init_builtin_type(),
            exception_type: exceptions::Exception::init_builtin_type(),
            func_type: function::SoxFunction::init_builtin_type(),
        }
    }
}

The init method above shows that each Sox type (e.g. SoxInt or SoxString, not to be confused with SoxType) is responsible for creating its own SoxType. This means that the methods and slots that go into a SoxType struct come from the actual Sox type, such as SoxInt or SoxString. The slot struct is just a struct whose fields are all optional functions with the signature fn(SoxObject, FuncArgs, &mut Interpreter) -> SoxResult. At the moment we have only defined the call slot, which we will see in more detail when we discuss implementing functions; we will add more slots as we progress. Sox types do the heavy lifting of populating the methods and slots of a given SoxType, and they must implement the StaticType and SoxClassImpl traits if they are going to be used to initialize a SoxType.

init_builtin_type(), a member of the StaticType trait shown below, handles creating instances of SoxType.

pub trait StaticType {
    const NAME: &'static str;
    fn static_cell() -> &'static OnceCell<SoxType>;
    fn init_builtin_type() -> &'static SoxType
    where
        Self: SoxClassImpl,
    {
        let typ = Self::create_static_type();
        let cell = Self::static_cell();
        cell.set(typ)
            .unwrap_or_else(|_| panic!("double initialization of {}", Self::NAME));
        let v = cell.get().unwrap();
        v
    }

    fn create_slots() -> SoxTypeSlot;
    fn create_static_type() -> SoxType
    where
        Self: SoxClassImpl,
    {
        let methods = Self::METHOD_DEFS;
        let slots = Self::create_slots();
        SoxType::new(
            None,
            methods
                .iter()
                .map(move |v| (v.0.to_string(), v.1.clone()))
                .collect::<HashMap<String, SoxMethod>>(),
            slots,
        )
    }

    fn static_type() -> &'static SoxType {
        Self::static_cell()
            .get()
            .expect("static type has not been initialized")
    }
}

The first step in this process is to create an instance of the SoxType struct via a call to create_static_type. In the create_static_type method, methods from the Self::METHOD_DEFS field defined in the SoxClassImpl trait shown below are used to populate the methods field of the SoxType struct.

pub trait SoxClassImpl {
    const METHOD_DEFS: &'static [(&'static str, SoxMethod)];
}
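The pattern of an associated constant feeding a default trait method can be sketched in isolation. In the example below, ClassImpl, Method, method_table and the Int type are hypothetical simplifications: methods are plain fn(i64) -> i64 pointers rather than SoxMethod values.

```rust
use std::collections::HashMap;

// Simplified stand-in for SoxMethod: a plain function pointer.
type Method = fn(i64) -> i64;

trait ClassImpl {
    const NAME: &'static str;
    const METHOD_DEFS: &'static [(&'static str, Method)];

    // Default method shared by every implementor: it turns the constant
    // table of (name, method) pairs into a lookup map.
    fn method_table() -> HashMap<String, Method> {
        Self::METHOD_DEFS
            .iter()
            .map(|(name, method)| (name.to_string(), *method))
            .collect()
    }
}

struct Int;

fn negate(x: i64) -> i64 {
    -x
}

fn abs(x: i64) -> i64 {
    x.abs()
}

// Each type only supplies its table; the map-building logic lives in the trait.
impl ClassImpl for Int {
    const NAME: &'static str = "int";
    const METHOD_DEFS: &'static [(&'static str, Method)] = &[("negate", negate), ("abs", abs)];
}

fn main() {
    let table = Int::method_table();
    println!("{} has {} methods", Int::NAME, table.len());
    println!("negate(4) = {}", (table["negate"])(4));
}
```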

The slots for a given type are also created here by each type. For example, in our language, we can call classes or functions the same way but each call behaviour is type-dependent. Once our static type is ready, we create a static cell where our type will live. One interesting thing I learnt is how Rust handles static variables. Take the implementation of static_cell below.

fn static_cell() -> &'static OnceCell<SoxType> {
    static CELL: OnceCell<SoxType> = OnceCell::new();
    &CELL
}

The static CELL is handled as if it were globally defined so once initialised, the same instance will always be used.
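A small experiment shows this behaviour. The sketch below uses std::sync::OnceLock from the standard library as a stand-in for the OnceCell used in Sox, and the int_cell and str_cell functions are hypothetical. Each function owns exactly one CELL for the lifetime of the program, and the two cells are distinct even though they share a name.

```rust
use std::sync::OnceLock;

// A function-local static has a single instance for the whole program,
// just like a global, but its name is only visible inside the function.
fn int_cell() -> &'static OnceLock<String> {
    static CELL: OnceLock<String> = OnceLock::new();
    &CELL
}

// A second function gets its own, separate CELL.
fn str_cell() -> &'static OnceLock<String> {
    static CELL: OnceLock<String> = OnceLock::new();
    &CELL
}

fn main() {
    let first = int_cell().get_or_init(|| "int".to_string());
    // The cell is already initialized, so this closure never runs.
    let second = int_cell().get_or_init(|| "never used".to_string());
    assert!(std::ptr::eq(first, second));
    // Setting an initialized cell fails, which is what the
    // "double initialization" panic in init_builtin_type guards against.
    assert!(int_cell().set("again".to_string()).is_err());
    println!("{first} {}", str_cell().get_or_init(|| "str".to_string()));
}
```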

Populating a type’s methods

Returning to METHOD_DEFS: it is a static slice of (name, SoxMethod) pairs. This is the list of methods a type supports, and every type implementing the SoxClassImpl trait provides one. For example, our bool type defines a bool method, and its METHOD_DEFS holds just one entry, whose SoxMethod is created from the bool method using the static_func function as below.

impl SoxClassImpl for SoxBool {
    const METHOD_DEFS: &'static [(&'static str, SoxMethod)] = &[
        ("bool", SoxMethod { func: static_func(SoxBool::bool) }),
    ];
}

If there were another method in our bool class, regardless of the number or types of its arguments, we would just create another entry using static_func and add it to the array. However, this raises the central question of this section: how is it possible to use static_func with functions of different types? This matters because there is no single function type in Rust. Each function item type is uniquely identified by its name, its type arguments, and its early-bound lifetime arguments, so how do we handle heterogeneous types as if they were homogeneous? The obvious way to do this in Rust is to represent each of our methods as a trait object, so we define SoxMethod and the types that go with it below.

pub type SoxNativeFunction = dyn Fn(&Interpreter, FuncArgs) -> SoxResult;

#[derive(Clone)]
pub struct SoxMethod {
    pub func: &'static SoxNativeFunction,
}

#[derive(Clone, Debug)]
pub struct FuncArgs {
    pub args: Vec<SoxObject>,
}

The implementation described here takes inspiration from RustPython. The func field in SoxMethod is an Fn trait object that takes a reference to an interpreter and a FuncArgs, a struct for passing a collection of arguments (we will see how we use this later).
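Before automating the conversion, it may help to see the end state written by hand. In this sketch the adapters are ordinary closures, arguments are a slice of integers instead of FuncArgs, and NativeFunction, Method, METHODS and call_method are hypothetical names. Two functions with different signatures end up stored side by side behind the same trait object type.

```rust
// Simplified stand-in for SoxNativeFunction: integers instead of SoxObjects.
type NativeFunction = dyn Fn(&[i64]) -> i64;

struct Method {
    func: &'static NativeFunction,
}

// Two functions with different signatures, and therefore different types.
fn negate(x: i64) -> i64 {
    -x
}

fn add(x: i64, y: i64) -> i64 {
    x + y
}

// Hand-written adapters unpack the argument list and forward to the real
// function. Both closures coerce to the same trait object type.
const METHODS: &[(&str, Method)] = &[
    ("negate", Method { func: &|args: &[i64]| negate(args[0]) }),
    ("add", Method { func: &|args: &[i64]| add(args[0], args[1]) }),
];

fn call_method(name: &str, args: &[i64]) -> Option<i64> {
    METHODS
        .iter()
        .find(|(n, _)| *n == name)
        .map(|(_, m)| (m.func)(args))
}

fn main() {
    println!("{:?}", call_method("negate", &[5]));
    println!("{:?}", call_method("add", &[2, 3]));
}
```

Writing one adapter per method by hand does not scale, which is the problem the trait-based conversion solves.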

We still have to get from functions with signatures such as fn(&SoxObject), fn(&SoxObject, SoxObject), etc. to an instance of our SoxMethod. To do this, we need to convert our methods to implementations of Fn(&Interpreter, FuncArgs). Rust traits and trait bounds help here. The static_func below, which generates the func value for our SoxMethod, is implemented so that its parameter is bounded by the NativeFn trait.

pub const fn static_func<Kind, R, F: NativeFn<Kind, R>>(f: F) -> &'static SoxNativeFunction {
    std::mem::forget(f);
    F::STATIC_FUNC
}

This trait bound restricts (or, in this case, expands?) the range of possible arguments to static_func to any type that implements the NativeFn trait shown below.

pub trait NativeFn<K, R>: Sized + 'static {
    fn call(&self, i: &Interpreter, arg: FuncArgs) -> SoxResult;
    
    const STATIC_FUNC: &'static SoxNativeFunction = {
        if std::mem::size_of::<Self>() == 0 {
            &|i, args| {
                // SAFETY: Self is zero-sized (checked above), so there are no bytes left uninitialized.
                let f = unsafe { std::mem::MaybeUninit::<Self>::uninit().assume_init() };
                f.call(i, args)
            }
        } else {
            panic!("function must be zero-sized to access STATIC_FUNC")
        }
    };
}

The other part of the puzzle works because functions in Rust can implement traits. So for each function type that we have in a Sox type, we implement the NativeFn trait so that the function can be used as a parameter to static_func.

The STATIC_FUNC block in the NativeFn trait is an interesting one; when the block is referenced by the static_func call, it returns a closure that implements the Fn(&Interpreter, FuncArgs) trait. The check, std::mem::size_of::<Self>() == 0, is done for safety reasons because we initialize the function using MaybeUninit. By checking that Self has a size of zero, we know that there is no uninitialized data in Self. It may seem odd for an item to have a size of 0 but Rust has the concept of zero-sized types such as function items. See the function item reference for more information.
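The zero-size claim is easy to check. The sketch below compares a function item, a function pointer and a capturing closure; double and sizes are hypothetical names used only for this example.

```rust
use std::mem::size_of_val;

fn double(x: i64) -> i64 {
    x * 2
}

// Returns the sizes, in bytes, of a function item, a function pointer,
// and a closure that captures an i64.
fn sizes() -> (usize, usize, usize) {
    // A function item has its own unique type that carries no data.
    let item = double;
    // Coercing it to a function pointer produces a pointer-sized value.
    let pointer: fn(i64) -> i64 = double;
    // A closure that captures state must store that state.
    let offset: i64 = 3;
    let capturing = move |x: i64| x + offset;
    (size_of_val(&item), size_of_val(&pointer), size_of_val(&capturing))
}

fn main() {
    let (item, pointer, capturing) = sizes();
    println!("function item: {item} bytes");
    println!("function pointer: {pointer} bytes");
    println!("capturing closure: {capturing} bytes");
}
```

A capturing closure or a function pointer would fail the size check in STATIC_FUNC, because conjuring one from uninitialized memory would leave its captured data or its address undefined.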

So how do we go about implementing our NativeFn trait for our functions? This is best illustrated by an example such as the one below.

 pub fn bool_(&self) -> Self {
     self.clone()
 }

The bool method here is a function that takes a single borrowed parameter, self. An implementation of the NativeFn trait for the function above takes the form below.

pub struct BorrowedParam<T>(PhantomData<T>);
pub struct OwnedParam<T>(PhantomData<T>);

impl<F, S, R> NativeFn<(BorrowedParam<S>,), R> for F
where
    F: Fn(&S) -> R + 'static,
    S: FromArgs,
    R: ToSoxResult,
{
    fn call(&self, i: &Interpreter, mut args: FuncArgs) -> SoxResult {
        let (zelf,) = (args.bind::<(S,)>(i)).expect("Failed to bind arguments");
        (self)(&zelf).to_sox_result(i)
    }
}

The S type parameter is the type of the borrowed parameter that eventually gets passed to the underlying method, and the tuple (BorrowedParam<S>,) in the K position of NativeFn records the shape of the parameter list. Notice BorrowedParam, which is generic over S. If we used just S and later wanted an implementation of this trait for a function type that takes an owned parameter as its sole argument, the compiler would reject the second implementation as conflicting with the first, even though the bounds on the F type parameter are different. The trick to get around this is to use a different marker type for each of our implementations. In this case, we use the generic BorrowedParam<S>. When working with owned arguments, we use the generic OwnedParam<S> type. We do not care about the BorrowedParam or OwnedParam types themselves; we care only about the S type that they are generic over. We could have used any other generic type, such as Vec<S>, instead.
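The marker trick can be tried on a much smaller trait. In the sketch below, Describe, Arg, describe, by_ref and by_value are hypothetical names; Arg plays the role that FromArgs plays in Sox by ruling out reference types, which is what lets the compiler pick exactly one implementation for each function.

```rust
use std::marker::PhantomData;

#[allow(dead_code)]
struct BorrowedParam<T>(PhantomData<T>);
#[allow(dead_code)]
struct OwnedParam<T>(PhantomData<T>);

// Stand-in for FromArgs: implemented for value types only, not references.
trait Arg {}
impl Arg for i64 {}

trait Describe<Kind> {
    fn describe(&self) -> &'static str;
}

// The two impls target different instantiations of Describe, so they do
// not conflict even though both are written "for F".
impl<F, S> Describe<(BorrowedParam<S>,)> for F
where
    F: Fn(&S),
    S: Arg,
{
    fn describe(&self) -> &'static str {
        "takes a borrowed parameter"
    }
}

impl<F, S> Describe<(OwnedParam<S>,)> for F
where
    F: Fn(S),
    S: Arg,
{
    fn describe(&self) -> &'static str {
        "takes an owned parameter"
    }
}

// Kind is inferred: only one of the impls above applies to a given function.
fn describe<Kind, F: Describe<Kind>>(f: F) -> &'static str {
    f.describe()
}

fn by_ref(_: &i64) {}
fn by_value(_: i64) {}

fn main() {
    println!("by_ref {}", describe(by_ref));
    println!("by_value {}", describe(by_value));
}
```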

Our implementation above introduces several new concepts. Two of them, the FromArgs and ToSoxResult traits, are shown below.

pub trait FromArgs: Sized {
    fn from_args(i: &Interpreter, args: &mut FuncArgs) -> SoxResult<Self>;
}

pub type SoxResult<T = SoxObject> = Result<T, SoxObject>;

pub trait ToSoxResult: Sized {
    fn to_sox_result(self, i: &Interpreter) -> SoxResult;
}

ToSoxResult is a trivial trait that enables uniform handling of return types. FromArgs is also simple but a little trickier. Types that implement FromArgs can reconstitute themselves from the FuncArgs struct that we mentioned previously. When calling the func field of a SoxMethod, a FuncArgs struct that contains all the arguments required by the underlying function call is passed to it. In the case of our bool method above, the FuncArgs struct will contain self, the object we are calling the method on. These arguments have to be bound to the method's parameters before the method is called, and this is where the FromArgs trait comes in. Consider the line below, taken from the NativeFn implementation above.

let (zelf,) = (args.bind::<(S,)>(i)).expect("Failed to bind arguments");

args is a FuncArgs instance, while (S,) is a tuple of our function's parameter types; in this case, it holds just a single parameter. The FuncArgs struct defines the bind method, which calls from_args on the type that is passed to it. That type is a tuple of some size n, so we must define implementations of FromArgs for all tuple sizes that we wish to support. The implementation for a given tuple size calls from_args on each type in the tuple and returns the resulting values in a tuple. These implementations are shown below.

impl<A: FromArgs> FromArgs for (A,) {
    fn from_args(i: &Interpreter, args: &mut FuncArgs) -> SoxResult<Self> {
        Ok((A::from_args(i, args)?,))
    }
}

impl<A: FromArgs, B: FromArgs> FromArgs for (A, B) {
    fn from_args(i: &Interpreter, args: &mut FuncArgs) -> SoxResult<Self> {
        Ok((A::from_args(i, args)?, B::from_args(i, args)?))
    }
}

impl<A: FromArgs, B: FromArgs, C: FromArgs> FromArgs for (A, B, C) {
    fn from_args(i: &Interpreter, args: &mut FuncArgs) -> SoxResult<Self> {
        Ok((
            A::from_args(i, args)?,
            B::from_args(i, args)?,
            C::from_args(i, args)?,
        ))
    }
}

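impl<T: TryFromSoxObject> FromArgs for T {
    fn from_args(i: &Interpreter, args: &mut FuncArgs) -> SoxResult<Self> {
        // Consume the first remaining argument so that the next parameter
        // binds to the argument after it. Like the unwrap it replaces, this
        // panics if no arguments remain.
        let v = args.args.remove(0);
        T::try_from_sox_object(i, v)
    }
}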
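The binding step can be exercised on its own with a cut-down model. In the sketch below, arguments are i64 values, errors are strings, and the interpreter parameter is dropped; these simplifications, along with the i64 and bool implementations, are assumptions made for the example rather than the Sox definitions. Each from_args call removes the argument it binds, so tuple elements see successive arguments.

```rust
// Simplified model: arguments are i64 values and errors are strings.
struct FuncArgs {
    args: Vec<i64>,
}

trait FromArgs: Sized {
    fn from_args(args: &mut FuncArgs) -> Result<Self, String>;
}

impl FromArgs for i64 {
    fn from_args(args: &mut FuncArgs) -> Result<Self, String> {
        if args.args.is_empty() {
            return Err("missing argument".to_string());
        }
        // Remove the first remaining argument so the next parameter
        // sees the one after it.
        Ok(args.args.remove(0))
    }
}

impl FromArgs for bool {
    fn from_args(args: &mut FuncArgs) -> Result<Self, String> {
        // Reuse the i64 binding and treat any non-zero value as true.
        Ok(i64::from_args(args)? != 0)
    }
}

impl<A: FromArgs> FromArgs for (A,) {
    fn from_args(args: &mut FuncArgs) -> Result<Self, String> {
        Ok((A::from_args(args)?,))
    }
}

impl<A: FromArgs, B: FromArgs> FromArgs for (A, B) {
    fn from_args(args: &mut FuncArgs) -> Result<Self, String> {
        // Tuple fields are evaluated left to right, so A binds first.
        Ok((A::from_args(args)?, B::from_args(args)?))
    }
}

impl FuncArgs {
    // bind simply delegates to the requested type's from_args.
    fn bind<T: FromArgs>(&mut self) -> Result<T, String> {
        T::from_args(self)
    }
}

fn main() {
    let mut args = FuncArgs { args: vec![7, 0] };
    println!("{:?}", args.bind::<(i64, bool)>());
    let mut too_few = FuncArgs { args: vec![1] };
    println!("{:?}", too_few.bind::<(i64, i64)>());
}
```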

The TryFromSoxObject trait exists so that types that implement it can be constructed from a given SoxObject via the try_from_sox_object call. The bind method returns a tuple, and the members of this tuple are unpacked and assigned to variables that are passed as arguments to the method that we call. I have just described an implementation for a function that takes a single argument, but it is similar for functions that take a different number of parameters. For example, the implementation of the NativeFn trait for a function that takes two parameters (the first borrowed and the second owned) is shown below.

impl<F, S, T, R> NativeFn<(BorrowedParam<S>, OwnedParam<T>), R> for F
where
    F: Fn(&S, T) -> R + 'static,
    S: FromArgs,
    T: FromArgs,
    R: ToSoxResult,
{
    fn call(&self, i: &Interpreter, mut args: FuncArgs) -> SoxResult {
        let (zelf, v1) = (args.bind::<(S, T)>(i)).expect("Failed to bind function arguments.");
        (self)(&zelf, v1).to_sox_result(i)
    }
}
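Putting the pieces together, the whole path from a plain function to a uniformly callable method can be sketched end to end. This is a simplified model under several assumptions: arguments and results are i64 values, errors are strings, there is no interpreter parameter, binding failures are propagated with ? instead of expect, and the hypothetical into_method helper boxes a closure rather than using the zero-sized STATIC_FUNC trick.

```rust
use std::marker::PhantomData;

struct FuncArgs {
    args: Vec<i64>,
}

trait FromArgs: Sized {
    fn from_args(args: &mut FuncArgs) -> Result<Self, String>;
}

impl FromArgs for i64 {
    fn from_args(args: &mut FuncArgs) -> Result<Self, String> {
        if args.args.is_empty() {
            return Err("missing argument".to_string());
        }
        Ok(args.args.remove(0))
    }
}

#[allow(dead_code)]
struct BorrowedParam<T>(PhantomData<T>);
#[allow(dead_code)]
struct OwnedParam<T>(PhantomData<T>);

trait NativeFn<K>: 'static {
    fn call(&self, args: FuncArgs) -> Result<i64, String>;
}

// Any function of the shape fn(&S, T) -> i64 becomes callable with FuncArgs.
impl<F, S, T> NativeFn<(BorrowedParam<S>, OwnedParam<T>)> for F
where
    F: Fn(&S, T) -> i64 + 'static,
    S: FromArgs,
    T: FromArgs,
{
    fn call(&self, mut args: FuncArgs) -> Result<i64, String> {
        let zelf = S::from_args(&mut args)?;
        let other = T::from_args(&mut args)?;
        Ok((self)(&zelf, other))
    }
}

// The uniform method type that every converted function ends up as.
type Method = Box<dyn Fn(FuncArgs) -> Result<i64, String>>;

// Hypothetical replacement for static_func that boxes instead of
// relying on zero-sized function items.
fn into_method<K: 'static, F: NativeFn<K> + 'static>(f: F) -> Method {
    Box::new(move |args| f.call(args))
}

fn add(zelf: &i64, other: i64) -> i64 {
    *zelf + other
}

fn main() {
    let method = into_method(add);
    println!("{:?}", method(FuncArgs { args: vec![2, 3] }));
    println!("{:?}", method(FuncArgs { args: vec![2] }));
}
```

Boxing costs an allocation per method, which the static approach in the article avoids; the sketch trades that for not needing any unsafe code.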

We now have a system for creating object types, and we will use it extensively from here on. All of our implementation is available, as usual, on the Sox GitHub repository. Feel free to play with it and report any bugs. With our enhanced object system in place, we will next look at extending our built-in types to include functions.

Building with Rust
