Building an interpreter: Conditional execution

Mar 24, 2025

Until now, all statements and expressions we’ve added to our bytecode virtual machine have been compiled into unconditional instructions, executed one after the other. This has prevented us from controlling the flow of our program. In this post, we will introduce control flow statements, if-else, logical and/or, while and for statements, that will enable conditional execution of instructions.

These statements are relatively easy to implement; they require just the following new opcodes:

A conditional forward jump - JUMP_IF_FALSE
An unconditional forward jump - JUMP
An unconditional backward jump - LOOP

These opcodes take an operand, the jump target, which is an offset from the jump instruction to where execution should continue. At the heart of each of the control flow statements is a combination of these jump instructions to control the point at which our bytecode interpreter picks up executing instructions. We'll start with the if-else statement.

if-else statements

If statements conditionally execute different branches of code based on the value of expressions within them. Our interpreter will achieve this using the JUMP_IF_FALSE and JUMP instructions. Let’s see how the if statement is compiled.

As usual, we define the if_statement method below that will handle compiling the if statement.

pub fn if_statement(&mut self, i: &Interpreter){
    let loop_start  = self.chunk.as_ref().unwrap().code.len();
    self.consume(LeftParen, "Expect '(' after 'if'.");
    self.expression(i);
    self.consume(RightParen, "Expect '(' after 'if'.");
    let then_jump = self.emit_jump(OpJumpIfFalse);
        
    self.statement(i);
    let jump = self.emit_jump(OpJump);
    self.patch_jump(then_jump);

    if self.match_token(vec![TokenType::Else]){
        self.statement(i);
    }
     
    self.patch_jump(jump);
}

The if statement is of the form if expression then block else block. A compiled if statement is of the general form shown in figure 1.0.

The process for compiling the if statement is as follows.

First, compile the if statement's condition expression.
Emit a JUMP_IF_FALSE instruction with a placeholder as its jump target. This instruction will skip the if block if the condition from above is false. We have to use a placeholder as the target because we do not yet know where the code should jump to.
Compile the instructions within the if’s then block.
Emit a JUMP instruction with a placeholder for it target. If an else block exists, this instruction will skip over it. A placeholder is used for the jump target.
Compile statements in the else block if any.
Patch the jump instructions. We use the patch_jump method to replace the placeholders with the actual jump locations. Patching in this case means computing the required offset from the jump instruction to where the interpreter should continue execution and then updating the jump operands. Patching happens at different points within the code - once we know where a jump must end up we, patch that jump statement. The implementation of the patch_jump method is shown below.
```
fn patch_jump(&mut self, offset: usize){
    let jump = self.chunk.as_ref().unwrap().code.len() - offset - 2;
        
    if jump as i16 > i16::MAX {
        panic!("Too much code to jump over!")
     }
     self.chunk.as_mut().unwrap().code[offset] = ((jump >> 8) & 0xff) as u8;
     self.chunk.as_mut().unwrap().code[offset + 1] = (jump & 0xff) as u8;
}
```
The jump target is two bytes wide, large enough for Sox, so what the patch_jump method shows here is how we go about extracting the first and second bytes of our operand using bit manipulation in Rust.

To finish up, we add the JUMP and JUMP_IF_FALSE statements to our eval loop, as shown below, and implement helpers to disassemble them.

OpCode::OpJumpIfFalse => {
    let offset = read_short!(self);
    let val = peek_stack!(self);
    if !val.try_into_rust_bool(i) {
        self.ip += offset as usize;
    }
},
OpCode::OpJump => {
    let offset = read_short!(self);
    self.ip += offset as usize;
},

Logical Operators

Logical operators like and and or use a technique called "short-circuiting" to optimize execution. This means they can conditionally skip parts of the expression, directly affecting the program's flow. This is also implemented using JUMPS.

and Operator

Sox and expressions are of the form left expression and right expression. If the left side of this expression is false, the value of the entire expression is false, so the right expression is not evaluated. We use a JUMP_IF_FALSE instruction to skip the right expression and go directly to the next instruction after the and expression. This is shown in figure 2.0.

To compile an and statement, we add a parse rule for the and expressions to our parser table. The and operator is an infix operator so we define a parse rule with an infix and_ method for our expression parser table. This compiles the and expression and is shown below.

pub fn and_(&mut self, i: &Interpreter) {
    let end_jump = self.emit_jump(OpJumpIfFalse);
    self.emit_byte((OpPop, None));
        
    self.parse_with_precedence(Precedence::And, i).expect("TODO: panic message");
    self.patch_jump(end_jump);
}

Or Operator

If the left expression of an or expression is true, the entire expression is true, and the right expression is skipped. Conversely, if the left expresson is false, the right expression must be evaluated to determine the value of the whole expression. So when the left expression is false, we jump to instructions for the right expression, taking care to pop the value of the left expression that is still on the stack first. Figure 3.0 illustrates the jump flow for an or expression.

The or_ method for compiling the or statement is implemented in the same fashion as the and_ method and is shown below.

pub fn or_(&mut self, i: &Interpreter) {
    let else_jump = self.emit_jump(OpJumpIfFalse);
    let end_jump = self.emit_jump(OpJump);
        
    self.patch_jump(else_jump);
    self.emit_byte((OpPop, None));

    self.parse_with_precedence(Precedence::Or, i).expect("TODO: panic message");
    self.patch_jump(end_jump);

}

In both logical operators, there is no need to make any change to the eval loop as we already have implemenations for our jump instructions.

while statement

The while_statement method below compiles the while statement. This method goes into the statement method like handlers for other statements.

pub fn while_statement(&mut self, i: &Interpreter){
    let loop_start = self.chunk.as_ref().unwrap().code.len();
    self.consume(LeftParen, "Expect '(' after 'while'.");
    self.expression(i);
    self.consume(RightParen, "Expect ')' after 'while'.");
        
    let exit_jump = self.emit_jump(OpJumpIfFalse);
    self.emit_byte((OpPop, None));
    self.statement(i);
     
    self.emit_loop(loop_start);
    self.patch_jump(exit_jump);
}

Compiling a while statement combines the JUMP_IF_FALSE and LOOP instructions. The loop instruction is analogous to the JUMP instruction but takes a negative value for the jump target, meaning that a loop instruction takes the instruction pointer back to execute code that may have been previously executed. Figure 4.0 is a graphical illustration of compiled instructions for the while statement.

Before compiling the while condition expression that determines if the body of the while loop should execute, we mark the instruction pointer as the loop start; we will loop back to this point to continue execution when applicable.
Next, compile the while expression condition, which will leave a value on the stack after execution.
Next, we emit a JUMP_IF_FALSE statement. This is the exit jump which should jump out of the while statement loop when the condition expression is False.
We emit a POP instruction that should clear the condition expression from the stack when executed.
We then go ahead and compile the body of the while expression.
Next we emit a LOOP statement with the jump target set to the value from #1.
Finally, we patch the JUMP_IF_FALSE statement since we now know where execution should continue.

The next step is to add the loop to our eval loop below and implement helpers for disassembling this instruction.

OpCode::OpLoop => {
    let offset = read_short!(self);
    self.ip -= offset as usize;
}

for Statement

The for statement is the trickiest control flow statement due to the number of components. We must support the initializer, condition, and increment clauses for the for statement. As usual, we define a for_statement method and add it to the statement method. The for_statement method is shown below.

pub fn for_statement(&mut self, i: &Interpreter){
        self.begin_scope(i);
        
        self.consume(LeftParen, "Expect '(' after 'for'").expect("");
        if self.match_token(vec![TokenType::Semi]){
            
        } else if self.match_token(vec![TokenType::Let]){
            self.let_declaration(i);
        } else{
            self.expression_statement(i);
        }
        self.consume(TokenType::Semi, "Expect ';'").expect("");
        let mut loop_start = self.chunk.iter().len();

        let exit_jump = None;
        if (!self.match_token(vec![TokenType::Semi])){
            self.expression(i);
            self.consume(TokenType::Semi, "Expect ';' after variable declaration.").expect("");

            let exit_jump = Some(self.emit_jump(OpJumpIfFalse));
            self.emit_byte((OpPop, None));
        }
        if (!self.match_token(vec![TokenType::RightParen])){
            let body_jump = self.emit_jump(OpJump);
            let increment_start = self.chunk.iter().len();
            self.expression(i);
            self.emit_byte((OpPop, None));
            self.consume(TokenType::RightParen, "Expect ')' after for clauses").expect("");
            
            self.emit_loop(loop_start);
            loop_start = increment_start;
            self.patch_jump(body_jump);
        }
        
        self.consume(TokenType::Semi, "Expect ';'").expect("");
        
        self.consume(RightParen, "Expect ')' after for clauses.").expect("");
    
        self.statement(i);
        self.emit_loop(loop_start);
        if exit_jump.is_some() {
            self.patch_jump(exit_jump.unwrap());
            self.emit_byte((OpPop, None));
        }
        self.end_scope(i)
    }

A for loop can declare a variable that should be scoped to the body of the loop, so the implementation of for_method goes in a begin_scope...end_scope.
For the initializer clause, we can use a variable declaration, an expression or nothing, so we search accordingly, taking care to avoid leaving any value on our stack.
We mark the instruction pointer at this point as the loop_start. When the for statement loops back, it should loop back to here and continue execution.
The condition clause, where present, determines if the for loop should continue executing. We search for a condition clause expression and, where present, compile it with an expression call that leaves the value on the stack after execution.
We then emit an OP_JUMP_IF_FALSE — if the value of the condition expression is false, then this is our exit jump from the loop.
This is followed by a POP instruction to clear the stack.
Next, we have to check for our increment clause. We must perform an intricate set of jumps to compile the increment statement within the for loop. This is because even though the increment precedes the body of the for statement, the increment instruction is executed after the for loop's body is executed. To compile this, we emit a JUMP instruction as the first instruction; this will cause our interpreter to jump to the body of the for loop and execute. Next, we emit instructions for the increment statement. After the increment code section executes, we must execute the conditional clause of the for loop. Hence, we emit a LOOP expression that jumps to just before the conditional expression. Finally, we update the loop_start to point to the start of the increment code section so that the LOOP after the for loop’s body takes execution to the start of the increment code section. Figure 5.0 captures the LOOPs and JUMPs used in the for statement.
Figure 5.0: Jumps in a for statement
The interpreter executes a set of instructions till it get to the increment instruction. It then jump over the increment instructions to the loop's body and executes the body. At the end of the body, it jumps to the start of the increment instruction, which, after executing, jumps to the condition, and the whole cycle continues until we can exit the loop.

With jumps implemented, we now have almost all we need to introduce functions and classes, the most important Sox abstractions. In the next set of posts we will begin to look at these. As usual, all code is in github.

Building with Rust

Discussion about this post