ISS Development, Bugs Galore!

Hello! Welcome to me working my way through the development of the ISS (Instruction Set Simulator) for RISC-4. Why do we need it? Well, I could run blindly into the RTL (Register Transfer Level). What is that? Great question! It's the level at which we describe the hardware, and Verilog and VHDL are the languages we use to do it. But instead of running in blind and grinding my face against a wall, bug after bug, rewrite after rewrite, ISA fix after ISA fix (you get the idea), we build a simple program in Python to shake the bugs out here instead of in the RTL! So let's go through all the mistakes I made while writing it, but first let's quickly go over what exactly an ISS does.

The Instruction Set Simulator and You.

So we need to confirm that the ideas set out in the ISA actually work. I wrote it, but I have no idea if it'll behave like we expect; hell, even my examples might not work like I wrote them. So we have a little program that decodes each instruction, figures out which function it needs to call, passes the instruction along, and does the work we defined in the ISA. Simple! It's only a couple hundred lines, and that's because I'm not the best Python developer. There are probably ways to make it faster, but that's OK: we don't need this to be fast, it just needs to prove the ISA works, and it does!! Well, so far. We'll get into that later in this post.
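To make the decode-and-dispatch idea concrete, here is a minimal sketch. The encoding is an assumption for illustration only (16-bit instructions with the opcode in the top 4 bits), and `TinyISS`, `decode_opcode`, and the two handlers are hypothetical names, not the real RISC-4 simulator.

```python
def decode_opcode(instr):
    # Assumption: 16-bit instruction word, opcode in bits [15:12].
    return (instr >> 12) & 0xF


class TinyISS:
    def __init__(self):
        self.log = []
        # Map each opcode to the method that implements it.
        self.handlers = {
            0x0: self.exec_nop,  # hypothetical opcode assignment
            0x1: self.exec_add,  # hypothetical opcode assignment
        }

    def exec_nop(self, instr):
        self.log.append("nop")

    def exec_add(self, instr):
        self.log.append("add")

    def execute(self, instr):
        handler = self.handlers.get(decode_opcode(instr))
        if handler is None:
            raise ValueError(f"unknown opcode in {instr:#06x}")
        handler(instr)


iss = TinyISS()
iss.execute(0x1234)  # opcode 0x1
iss.execute(0x0000)  # opcode 0x0
print(iss.log)  # ['add', 'nop']
```

A dictionary of handlers keeps the decoder in one place, so adding an instruction means adding one entry and one method.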

Bugs, Bugs, Bugs!

So I said we had some bugs, right? Well, here is the best part: none of them were in the ISA! I mean, I do have to fix some things in the examples, but the actual instructions work!! So, bugs. Here is the first one.

#1: I forgot to initialize the flags.

First bug? First line of the class's __init__.

class RISC4:
  def __init__(self):
    self.flag_c = False # Just didn't include them whoops!
    self.flag_z = False

This one is pretty self-explanatory: if we don't initialize the flags, how can we use them?!

#2: Double incrementing the Program Counter(PC)

I incremented the PC in two places, fetch and execute. I was trying to implement the pipeline and didn't follow through. Turns out, an ISS should just simulate the ISA's behavior, not the pipeline's timing. So to get to the development faster, I decided to simply simulate the ISA. We will cross the pipeline bridge once the ISA is running.
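One way to keep this bug from coming back is to advance the PC in exactly one place. The sketch below is an illustration under assumptions, not the actual ISS: `PcDemo` is a hypothetical class, and it assumes each instruction occupies 4 nibbles and that the PC is 12 bits wide and counted in nibbles.

```python
class PcDemo:
    INSTR_NIBBLES = 4  # assumption: one instruction occupies 4 nibbles

    def __init__(self, program):
        self.program = program  # list of instruction words
        self.pc = 0             # counted in nibbles

    def fetch(self):
        instr = self.program[self.pc // self.INSTR_NIBBLES]
        # The only place the PC advances for sequential flow.
        self.pc = (self.pc + self.INSTR_NIBBLES) & 0xFFF
        return instr

    def execute(self, instr):
        # Handlers may overwrite self.pc for branches and jumps,
        # but they never add the sequential increment again.
        pass

    def step(self):
        self.execute(self.fetch())


demo = PcDemo([0x0000, 0x0000, 0x0000])
demo.step()
demo.step()
print(demo.pc)  # 8
```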

#3: Register r0 was not hardwired to zero

This is very important and broke everything. We implemented the pattern that is familiar for RISC, r0 is supposed to be hardwired to zero so we can easily clear registers and so on. So I wrote the test programs with the expectation that r0 is always zero, and when it isn't then things don't work how they should.

The fix was to simply enforce zero on every register write:

def write_reg(self, rd, value):
    if rd != 0:  # r0 is hardwired to zero
        self.reg[rd] = value & 0xF
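A quick check of that behavior, as a standalone sketch. `RegFile` is a hypothetical wrapper, and the register count of 16 is an assumption based on 4-bit register fields.

```python
class RegFile:
    def __init__(self):
        self.reg = [0] * 16  # assumption: 16 four-bit registers

    def write_reg(self, rd, value):
        if rd != 0:  # r0 is hardwired to zero
            self.reg[rd] = value & 0xF


rf = RegFile()
rf.write_reg(0, 0xA)   # silently ignored
rf.write_reg(3, 0x1F)  # truncated to 4 bits
print(rf.reg[0], rf.reg[3])  # 0 15
```

Routing every write through one function means no instruction handler can clobber r0 by accident.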

#4: Branch offset was not sign extended

When we tried to jump backwards, we jumped to garbage. Why? Because the 8-bit offset was being used as an unsigned value when the branch logic expected a signed one.

def sign_extend_8bit(value):
    if value & 0x80:  # Negative (bit 7 set)
        return value | 0xFFFFFF00
    return value

This makes sure the offset is treated as a signed value.
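One Python-specific detail worth knowing: integers are unbounded, so OR-ing in `0xFFFFFF00` produces a large positive number rather than a negative one. It only behaves like a negative offset once the result is masked back down to the PC width. The sketch below compares the version above with an alternative that returns a true negative integer. The function names with suffixes are mine, and the 12-bit mask is taken from the PC update used in bug 5.

```python
def sign_extend_8bit_masked(value):
    # Same logic as the version above: 32-bit style sign extension.
    if value & 0x80:
        return value | 0xFFFFFF00
    return value


def sign_extend_8bit_signed(value):
    # Alternative: return an actual negative Python int.
    if value & 0x80:
        return value - 0x100
    return value


raw = 0xFE  # -2 as an 8-bit two's complement value
a = sign_extend_8bit_masked(raw)
b = sign_extend_8bit_signed(raw)
print(a, b)                           # 4294967294 -2
print(hex(a & 0xFFF), hex(b & 0xFFF))  # 0xffe 0xffe
```

Both agree after masking to 12 bits, so either works as long as the result is always masked before use.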

#5: Branch offsets were not scaled to nibbles

We needed to ensure the program counter and the offset are counted in the same unit. We had the offset in bytes and the PC in nibbles, whoops! Both need to use the same unit, nibble and nibble or byte and byte. I decided to convert the offset to nibbles so I didn't blow up the rest of the ISS.

offset_nibbles = sign_extend_8bit(offset8) * 4
self.pc = (self.pc + offset_nibbles) & 0xFFF
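Here is a small worked example of that calculation, pulled out into a standalone function so it can be checked by hand. `branch_target` is a hypothetical helper, and whether the real ISS adds the offset to the current or the already-incremented PC is not something this sketch tries to settle; it just mirrors the two lines above.

```python
def sign_extend_8bit(value):
    # Returns a negative Python int when bit 7 is set.
    return value - 0x100 if value & 0x80 else value


def branch_target(pc, offset8):
    offset_nibbles = sign_extend_8bit(offset8) * 4
    return (pc + offset_nibbles) & 0xFFF  # wrap to the 12-bit PC


print(hex(branch_target(0x010, 0xFE)))  # 0x8   (offset -2, back 8 nibbles)
print(hex(branch_target(0x010, 0x02)))  # 0x18  (offset +2, forward 8 nibbles)
print(hex(branch_target(0x004, 0xFE)))  # 0xffc (wraps around below zero)
```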

#6: Forgot to clear the flags on logical operations

I forgot to clear the Carry flag on AND/OR/XOR. Simple fix, but a bug nonetheless.

def exec_and(self, instr):
    # ... compute result ...
    self.flag_c = False  # Logical ops clear carry
    self.flag_z = (result == 0)
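As a standalone illustration of the flag behavior (not the real handler), the sketch below returns the result and both flags from a hypothetical `logic_and` helper operating on 4-bit values.

```python
def logic_and(a, b):
    result = (a & b) & 0xF
    flag_c = False            # logical ops clear carry
    flag_z = (result == 0)    # zero flag tracks the result
    return result, flag_c, flag_z


print(logic_and(0b1010, 0b0101))  # (0, False, True)
print(logic_and(0b1100, 0b0110))  # (4, False, False)
```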

#7: SLT didn't give correct answers on signed comparisons

Another signed oopsie: if we passed a negative number to SLT, it wasn't treated as less than a positive one. The fix is to convert both operands to signed values before comparing.

def to_signed(value):
    return value if value < 8 else value - 16

def exec_slt(self, instr):
    # ... decode ...
    rs_signed = to_signed(self.reg[rs])
    rt_signed = to_signed(self.reg[rt])
    result = 1 if rs_signed < rt_signed else 0
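A few concrete values show why the conversion matters. This sketch reuses `to_signed` from above and wraps the comparison in a hypothetical `slt` function that works on raw 4-bit register values.

```python
def to_signed(value):
    # 4-bit two's complement: 0..7 stay positive, 8..15 map to -8..-1.
    return value if value < 8 else value - 16


def slt(a, b):
    return 1 if to_signed(a) < to_signed(b) else 0


print(slt(0xF, 0x1))  # 1, because -1 < 1
print(slt(0x1, 0xF))  # 0, because 1 is not less than -1
print(slt(0x8, 0x7))  # 1, because -8 < 7
print(0xF < 0x1)      # False: the unsigned comparison gets it wrong
```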

#8: Memory address calculation was wrong

Load and store computed the wrong address because I forgot to combine the register pair that holds the base address.

def exec_load(self, instr):
    # ... decode ...
    base_addr = (self.reg[base] << 4) | self.reg[base + 1]
    offset_signed = sign_extend_4bit(offset4)
    addr = (base_addr + offset_signed) & 0xFF
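To see the register-pair math with real numbers, here is a standalone sketch. `sign_extend_4bit` is used above but not shown, so the version here is my assumption of what it does (4-bit two's complement), and `effective_address` is a hypothetical helper.

```python
def sign_extend_4bit(value):
    # Assumed behavior: treat bit 3 as the sign bit.
    return value - 16 if value & 0x8 else value


def effective_address(reg, base, offset4):
    # High nibble comes from reg[base], low nibble from reg[base + 1].
    base_addr = (reg[base] << 4) | reg[base + 1]
    return (base_addr + sign_extend_4bit(offset4)) & 0xFF


reg = [0] * 16
reg[2], reg[3] = 0x3, 0xC  # register pair holds 0x3C

print(hex(effective_address(reg, 2, 0x1)))  # 0x3d (offset +1)
print(hex(effective_address(reg, 2, 0xF)))  # 0x3b (offset -1)
```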

#9: Store used the wrong address

I accidentally took the address from rd instead of storing there. Fix is simple, just store to the right place!

def exec_store(self, instr):
    _, rs, base, offset4 = decode_m_type(instr)
    # rs is in bits [11:8] for SW, not rd
    # ... rest of implementation ...

#10: ADC/SBB didn't read the existing rd value

ADC and SBB are destructive to rd, so in order to actually do anything we need to read rd first before we blow it away.

def exec_ext(self, instr):
    # ... decode funct, rd, rs ...
    if funct == 0x0:  # ADC
        # Must read rd first (destructive operation)
        result = self.reg[rd] + self.reg[rs] + (1 if self.flag_c else 0)
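Why ADC matters becomes clearer with a multi-precision example. The sketch below adds two 8-bit numbers one nibble at a time, with the carry from the low nibble feeding the high nibble. `add4` is a hypothetical helper, and the carry rule (set when the sum exceeds 4 bits) is my assumption about how the flag behaves.

```python
def add4(a, b, carry_in=0):
    # Returns the 4-bit result and the carry out.
    total = a + b + carry_in
    return total & 0xF, total > 0xF


# 0x3C + 0x27, low nibble first (ADD), then high nibble (ADC).
lo, carry = add4(0xC, 0x7)
hi, carry_out = add4(0x3, 0x2, 1 if carry else 0)

print(hex((hi << 4) | lo), carry_out)  # 0x63 False
```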

#11: Multi precision failed due to r0 not being zero

Remember bug 3? Yup, it came back to haunt us, didn't it? This one went away once we fixed r0.

#12: JAL checked the wrong bit

We forgot to check bit 11. Yes, I know in a perfect world we would be checking bit 12, but I ran out of encoding space! Either way, bit 11 tells us whether this is J or JAL, and because we never checked it, JAL acted like a normal jump.

def exec_jump(self, instr):
    _, target12 = decode_j_type(instr)
    is_jal = (instr >> 11) & 1  # Check instruction bit 11, not target

#13: Lucky number 13, JAL and J multiplied by 4

Remember bug 5? Well, I did it twice! I was trying to convert to nibbles, but this time the target was already in nibbles, so we ended up at four times the intended address by mistake.

def exec_jump(self, instr):
    # ... decode target12 and is_jal as in bug 12 ...
    if is_jal:
        target = target12 & 0x7FF  # Mask to 11 bits
        self.pc = target  # Already in nibbles, don't multiply!
    else:
        self.pc = target12  # Already in nibbles
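Putting bugs 12 and 13 together, here is a standalone sketch of the jump decode. The field layout in `decode_j_type` (opcode in bits [15:12], 12-bit field in bits [11:0]) and the opcode value 0xC are assumptions for illustration, and `jump_info` is a hypothetical helper.

```python
def decode_j_type(instr):
    # Assumed layout: opcode in bits [15:12], 12-bit field in bits [11:0].
    return (instr >> 12) & 0xF, instr & 0xFFF


def jump_info(instr):
    _, target12 = decode_j_type(instr)
    is_jal = (instr >> 11) & 1  # instruction bit 11 selects JAL
    # For JAL, bit 11 is the selector, so only the low 11 bits are the target.
    target = target12 & 0x7FF if is_jal else target12
    return bool(is_jal), target  # target is already in nibbles


print(jump_info(0xC923))  # (True, 291)  JAL to 0x123
print(jump_info(0xC123))  # (False, 291) J to 0x123
```

Note that neither path multiplies the target, since it is already a nibble address.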

Conclusion!

What did we learn tonight? A lot! Sign extension is harder than it sounds, so make sure it's correct! Always double check your register definitions against the spec. I wrote it, but I forgot to reference it at times; hoist by my own petard. This address scaling business is tricky, and we need to be consistent: nibbles, bytes, and instructions are all different and all important. We learned to always double check our destructive implementations. Finally, bit fields are hard too! But we learned, didn't we, and that is the whole point. The ISS is now passing all my hand-written test programs, which means the ISA is solid enough to start writing Verilog.

Next Up:

I'll be writing more test programs to stress-test the ISA: things like multi-precision multiplication, recursive function calls, and maybe a tiny Fibonacci calculator. Stay tuned!

Jamie Larson