Materializing heisenbugs and education
October 30, 2007

After a couple of days spent enhancing and debugging NxOS, I feel like telling the world what I’ve been up to.
The first big change to NxOS is a change of direction. When we started last February, our main goal was to build a kernel that would provide a preemptively multitasked environment for programming robots in C. In a sense, we wanted to be the BrickOS of the NXT.
A couple of things made me want to change that goal. First of all, right now there is a huge amount of code duplication among NXT projects. Lejos has mostly led the way in figuring out how the brick works and getting initial support. What they call the “platform” code (the code that boots the JVM and keeps it running) has been reused in various other projects, such as Lejos OSEK. It is also a reference implementation for us in NxOS, since it was written by rather smart people, and the code generally gets the job done. However, Lejos is not currently ready to split their platform code out into an independent base reusable by many, so the mechanism of reuse today is “fork it”. This bothers me, because it hurts the quality of everyone’s kernels.
Second, I stumbled upon the Transterpreter project about a month ago. The Transterpreter is a virtual machine that emulates a transputer, a general purpose parallel processor produced in the 1980s. The Transterpreter project aims to use the transputer architecture, and specifically its reliance on CSP algebra, to teach parallel computing to students. The objective is to show them that concurrency is neither necessarily hard nor evil (which is what they learn from “Object Oriented Programming in Java” courses), given the right tools and ways of thinking about parallelism.
I like the objectives and philosophy of this project. Anything that pulls software engineering education away from “Learn Java in 5 years” feels like a win to me, but especially in this case, since the Transterpreter is geared towards running on embedded systems. And by embedded systems I mean robots. And teaching concurrent programming by making students build and program robots is awesomeness squared.
Anyway, so I like the project. And lo and behold, they had no port to the Mindstorms NXT brick.
I say ‘had’, because a few weeks ago, I got together with Transterpreter hackers Matt and Christian, and we built a working port. It is still very basic in various ways (almost no interfaces from the virtual machine to the hardware, no dynamic memory allocation support…), but it works well enough to have a couple of occam2 processes cooperating to produce a sawtooth beeping sound.
For both of these reasons, I wanted to change direction a little. Now, our goal is to address two different audiences. On the one hand, we want to cater to the robot builders who want to program in C in a preemptive multitasked environment, same as before. On the other hand, we also want to cater to the lower level guys, those who have something like the Transterpreter, but don’t want to go through the pain of bootstrapping the NXT and implementing all the device drivers.
Yesterday, I reorganized our code to match this new goal. We now have two mostly distinct projects, which I’ll now detail. These are the Baseplate and Marvin. As an aside, the only name we are fully agreed upon for now is “NxOS” to refer to the whole project. The names of the subprojects are still in flux while we look for something cool. Ideas welcome! We’d like names that are in some way evocative of the goal/function in life of the components, not just random codenames pulled out of thin air. Anyway, on to the tour!
First, we have the Baseplate, which consists of the absolute minimum bootstrapping code to fire up the brick, initialize all peripherals, and run a main(). We do not implement main(); that is left up to the kernel designer. The Baseplate is mostly the bare minimum you need to be able to control all of the hardware. There are few luxuries beyond that: there is no scheduler, and although we do provide the TLSF real time memory allocator, its use is completely optional, and the code will be dropped from the final firmware image if you don’t need dynamic memory allocation.
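To make the split concrete, here is a minimal sketch in C of the hand-off described above, written so that it runs on a desktop machine rather than on the brick. Every name in it (`baseplate_state`, `app_kernel_entry`, `baseplate_boot`, `demo_kernel`) is made up for illustration and is not the actual NxOS API; on the real hardware, the role of the entry function pointer is played by main() itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of which peripherals the bootstrap has brought up.
 * These fields are illustrative only, not real NxOS structures. */
typedef struct {
    bool clock_ready;
    bool display_ready;
    bool coprocessor_ready;
} baseplate_state;

/* An application kernel is just an entry point. On the brick this role
 * is played by main(); a function pointer is used here so the sketch
 * can be exercised on a host machine. */
typedef int (*app_kernel_entry)(const baseplate_state *hw);

/* Stand-in for the bootstrap: mark every peripheral as initialized,
 * then hand control to the application kernel and return its result.
 * Returns -1 if either argument is missing. */
int baseplate_boot(app_kernel_entry entry, baseplate_state *hw)
{
    if (entry == NULL || hw == NULL) {
        return -1;
    }
    hw->clock_ready = true;        /* assumption: real code would program the clock here */
    hw->display_ready = true;      /* assumption: real code would set up the LCD controller */
    hw->coprocessor_ready = true;  /* assumption: real code would open the coprocessor link */
    return entry(hw);
}

/* A trivial application kernel: it only checks that the hardware it was
 * handed is fully initialized. Returns 0 if so, 1 otherwise. */
int demo_kernel(const baseplate_state *hw)
{
    bool all_up = hw->clock_ready && hw->display_ready && hw->coprocessor_ready;
    return all_up ? 0 : 1;
}
```

Swapping in a different application kernel is then just a matter of passing a different entry function to the hypothetical `baseplate_boot`; the bootstrap side does not change.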
It is my hope/dream that, in time, this Baseplate will become the preferred way to bootstrap an operating system onto the NXT, so that people can stop having to figure out yet again how the coprocessor communication link works, or what magic values need to be plugged into the display controller chip to correctly drive the LCD. Their projects are interesting because of what they do with these facilities, so they shouldn’t have to reimplement them after someone has gone through the trouble of figuring them out.
The second component of NxOS is an implementation of the Baseplate’s main() (which we call an “application kernel”, as opposed to the bootstrap kernel provided by the Baseplate). It is called Marvin, mostly because I got tired of looking for a name that was suitably cool yet informative, and just went with a reference to both Marvin the paranoid android, and Merlin in Disney’s The Sword in the Stone, whose name is usually mispronounced as Marvin by Sir Ector. It provides (or will provide, at this stage) a preemptive multitasking scheduler and facilities for uploading “userspace” C programs to run, covering both the brick side and the computer side (linker scripts, IDE integration and the like).
But Marvin will, in this architecture, be simply one of many things running thanks to the Baseplate. Right now, we have three application kernels in various stages of development: Marvin; a test kernel that exercises the Baseplate; and the TVM kernel that runs a Transterpreter. It’s my hope that I’ll also be able to port an NBC (Nxt Byte Code, the virtual machine that “regular” Lego Mindstorms programs are compiled to) machine to run on the Baseplate, and maybe even pbLua, pbForth, and why not Lejos? I think that all of these projects have their own distinctive features, and that none of those features are related to the low level bit pushing of crap around in the hardware. So why not give them a chance to stop worrying about it, and just focus on their own projects to build high level kernels? In a way, I guess that I’d like NxOS to be a little like the Borg, assimilating NXT kernels into collectively using the Baseplate as their core. But without all that ugly implanting of stuff in eyes and trying to take over the universe. We’re not like that.
Currently, our Baseplate is quite honestly lacking. Lejos’ platform code supports more hardware, partially through an early start, and partially because they are many and we are between one and three, depending on the period. So we need to catch up a lot before we can aspire to being a comprehensive base for all these projects. And I suspect that many people will wonder what the point is, and might even be pretty hostile to the idea of slaving themselves to our code, partially because it involves extra work, and partially because of various manifestations of NIH. I find that to be a pity: the NXT kernel hacker community is fairly small (I’d estimate under 20 people in open source, slightly more if you include folks like RobotC), and I think that if we all united around one codebase for the core platform support, we could get a much more cohesive community and have fun together, instead of being partitioned off in our personal projects.
Hey, look, I’m rambling. That is my hope for the future of NxOS. I guess it’s really delusional of me to think that NxOS will ever be a uniting force to anyone but its creators, but hey, that’s what I’d like. If it doesn’t happen, I still have tremendous amounts of fun building kernels, and architecting it all to support the assimilation concept. If I’m the only one to believe that unification is worthwhile, so be it. When the time comes, I may start advocating. In the meantime, I’m having fun building an embedded system, learning very valuable stuff along the way, and finding ways to reference the Borg on my blog. That’s good enough for now.
Oh, I was supposed to rant about bugs I encountered and solved these past few days, and how what I learned compares to the “standard” education I’m getting at university. But I got carried away, and I’m itching to get back to hacking NxOS now. So it’ll be a story for another time.
Peace, out.