ARM7 task startup gotchas

I’ve just managed to get a basic form of preemptive task switching going on NxOS Marvin. And I stumbled on something for about an hour that sounds completely obvious to me now, so I thought I might share the wisdom, just in case I’m not the only one this would have surprised in the wild.

Marvin implements task switching using a fairly simple mechanism: when an interrupt occurs, the generic IRQ handler code determines whether the interrupted control flow was in User/System mode. If it wasn’t, it assumes that it interrupted Supervisor mode, and saves the interrupted state in part on the supervisor stack, and in part on the IRQ mode stack (only the interrupted PC and CPSR).

However, if the interrupted control flow was User/System mode, the IRQ handler assumes that something, presumably a scheduler, started executing “userland” tasks there, and would like the state to be saved in a slightly more orderly fashion. To do that, the handler switches to System mode, and saves the interrupted control flow state entirely on the System mode stack. In other words, the task’s state is saved on its own stack.

After a little twiddling to retrieve the PC and CPSR that were sitting at the bottom of the IRQ stack, the IRQ handler hands off control to whatever handler is registered for the IRQ it received. When the handler returns, the generic code does the same operations backwards, restoring the state either from Supervisor+IRQ mode stacks, or from the System mode stack, depending on what it did earlier.

The trick here, of course, is that one of the IRQ handlers is the system timer, and that handler calls a scheduler callback, if one is registered. And if this scheduler changes the System mode stack pointer before returning, the generic handler code gracefully restores the state from the System stack just the same… But it’s the state for a different task! And hey presto, you have a way to do task switching by swapping a single register!
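To make the "swap a single register" idea concrete, here is a minimal round-robin sketch. It is not Marvin's actual scheduler: the names `add_task` and `schedule`, the fixed-size table, and the convention of passing the stack pointer in and out as a function argument are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TASKS 4

/* Hypothetical bookkeeping: one saved System mode stack pointer per task. */
static uint32_t *saved_sp[MAX_TASKS];
static size_t num_tasks;
static size_t current_task;

/* Register a task by the stack pointer at which its saved state lives. */
static int add_task(uint32_t *sp) {
  if (num_tasks == MAX_TASKS)
    return -1;
  saved_sp[num_tasks++] = sp;
  return 0;
}

/* Assumed to be called from the timer IRQ, after the generic handler has
 * pushed the interrupted task's state onto that task's own stack.
 * Remembers where that state ended up, picks the next task, and returns
 * the stack pointer the generic handler should restore from. */
static uint32_t *schedule(uint32_t *interrupted_sp) {
  if (num_tasks == 0)
    return interrupted_sp;
  saved_sp[current_task] = interrupted_sp;
  current_task = (current_task + 1) % num_tasks;
  return saved_sp[current_task];
}
```

The scheduler never touches the saved registers themselves. It only decides which stack the restore code will pop from.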

There’s nothing really magic about any of that. Once you have the concept, it’s a couple of hours of nasty debugging to get the IRQ handler routine right, but that’s about it. Although, the description I give above kind of assumes that we already have a bunch of preempted tasks, and we just swap them in and out as we want to schedule them. But what about creating new tasks? How does one go about injecting a new task into this system?

Again, if you think about it a little, a very simple solution quickly comes to mind. You don’t need to give task creation any special treatment at all: you just need to create a fresh new stack for the new task, and prepopulate it with a state dump, in exactly the order that the IRQ handler expects. Then, you inject this newly crafted task stack into your scheduler, and sooner or later, it’ll get set as the system stack, and its state will be “restored” for the first time, and the task will fire up.
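As a rough sketch of what "prepopulating a stack" could look like: the layout below (CPSR, then r0-r12, LR and PC) is purely hypothetical, since the real order is whatever the IRQ handler's restore code pops. `struct task_state` and `init_task_stack` are names invented for this example, and the sketch assumes the usual descending stack.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical saved-state layout. The real one must mirror, word for
 * word, the order in which the IRQ handler restores registers. */
struct task_state {
  uint32_t cpsr;
  uint32_t r[13]; /* r0-r12 */
  uint32_t lr;
  uint32_t pc;
};

/* Carve a zeroed state dump out of the top of a fresh stack (the stack
 * is assumed to grow downwards). stack_words is the stack size in 32-bit
 * words. The returned pointer is the task's initial stack pointer, i.e.
 * the value to hand to the scheduler. */
static struct task_state *init_task_stack(uint32_t *stack_base,
                                          size_t stack_words) {
  uint32_t *top = stack_base + stack_words;
  struct task_state *state = (struct task_state *)top - 1;
  memset(state, 0, sizeof(*state));
  return state;
}
```

The general-purpose registers can stay zeroed for a task that takes no arguments. The `pc` and `cpsr` fields are the ones that have to be filled in before the task is first scheduled.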

So, easy peasy. All we need to do, if we assume that tasks take no arguments, is to set two registers: PC, the instruction pointer, and CPSR, the processor status register (which contains things like the processor mode, the state of arithmetic flags, whether IRQs are disabled…). If we set the PC to the address of the function we want to run, and the CPSR to the processor state we want the task to run in, the stack should be “resumable” by the IRQ handler code.

The thing that tripped me up is that the functions I’m trying to start are compiled as 16-bit Thumb opcodes, not 32-bit ARM opcodes. ARM’s standards say that, to distinguish between the two, a Thumb address is always odd (i.e. the least significant bit is set). That way, the bx instruction can determine whether a mode switch is needed. If the target is Thumb, the CPU gets switched to Thumb mode, and execution jumps to the floored address. That is, even though the address is odd, execution begins at (address - 1), the same address an ARM function would have.
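What bx does with the low bit can be modelled in a few lines of C. This is only an illustration of the rule described above, not code that runs on the target; `struct branch_result` and `model_bx` are made-up names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Where execution ends up after a bx, and in which instruction set. */
struct branch_result {
  uint32_t pc;
  bool thumb;
};

/* Model of bx: bit 0 of the target selects Thumb state, and is cleared
 * before the address is used as the new PC. */
static struct branch_result model_bx(uint32_t target) {
  struct branch_result r;
  r.thumb = (target & UINT32_C(1)) != 0;
  r.pc = target & ~UINT32_C(1);
  return r;
}
```

For example, `model_bx(0x00201235)` yields a PC of `0x00201234` with the Thumb flag set, while `model_bx(0x00201000)` leaves the address alone and stays in ARM state.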

All this sounds pretty easy, until you realize that when you are “resuming” a new task from a fabricated stack, you cannot rely on the CPU magically switching modes for you. The instruction you use to return to the original control flow at the end of the IRQ handler merely puts the PC and CPSR back into the CPU, exactly as you provide them.

So this means that during my first attempts, I tried to start a task by providing an odd-address PC and a CPSR that just specified the CPU mode, but left out the fact that the opcodes are Thumb. I thought the CPU would just do the right thing on resuming. Instead, the CPU helpfully jumped to an odd address, which is an alignment error, which in turn causes a prefetch abort, because the memory controller cannot read instructions that aren’t properly aligned. And of course, since our debugging facilities are so numerous (cough cough) on the NXT, the only symptom I had was a rhythmic ‘click’ signalling that the coprocessor had lost contact with the main CPU. Not surprising: it was stuck in the infinite loop we currently call “abort handler”.
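With debugging this sparse, a cheap sanity check on a fabricated state before handing it to the scheduler would have saved me the hour. The helper below is a hypothetical sketch (`state_is_resumable` and `CPSR_T_BIT` are names made up here). It only checks that the PC alignment matches the instruction set selected in the CPSR: word-aligned for ARM, halfword-aligned for Thumb.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPSR_T_BIT UINT32_C(0x20) /* set when the CPU runs Thumb code */

/* True if pc and cpsr can be loaded verbatim on return from the IRQ
 * handler: ARM code needs a 4-byte aligned PC, Thumb code a 2-byte
 * aligned one. An odd PC is never acceptable. */
static bool state_is_resumable(uint32_t pc, uint32_t cpsr) {
  uint32_t align_mask = (cpsr & CPSR_T_BIT) ? UINT32_C(0x1) : UINT32_C(0x3);
  return (pc & align_mask) == 0;
}
```

My broken first attempt, an odd PC with a CPSR of plain 0x1F, fails this check, whereas the same function address with bit 0 cleared and the Thumb bit set passes it.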

So, if you want to create new tasks and resume them by exploiting the IRQ mechanism, and you have both ARM and Thumb code lying around, you need something like the following:

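#include <stdint.h>

#define MODE_SYS   0x1F  /* CPSR mode bits: System mode */
#define MODE_THUMB 0x20  /* CPSR T bit: run in Thumb state */

/* my_func is the task's entry point, task_stack the fabricated state. */
uint32_t entry = (uint32_t)my_func;

task_stack->pc = entry & 0xFFFFFFFE;  /* the PC itself must be even */
task_stack->cpsr = MODE_SYS;
if (entry & 0x1)                      /* odd address: Thumb function */
  task_stack->cpsr |= MODE_THUMB;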

Looking at it now, the problem I had looks entirely obvious, but don’t all problems look that way in hindsight?