My first project involving the Virtual Boy was actually attempting to port Virtual Boy Wario Land to the Game Boy Advance. I chose this project because I wanted to learn about both the GBA and the VB, and this seemed like a good way to do both, since I’d need to understand the VB in order to reverse engineer the game.

To avoid having to distribute any of Nintendo’s IP, the game is actually a (rather heavy) ROM hack for the VB ROM that adds the GBA compatibility layer. In other words, applying the ROM hack to the VB ROM turns it into a GBA ROM!

For the most part, my code therefore treats the VB ROM as a bag of assets which are decoded on-the-fly by the GBA CPU while the game is running. All of the game logic, however, is reimplemented and compiled to ARM instructions since it would be way too slow to emulate the Virtual Boy’s V810 CPU.

Game Boy Advance vs Virtual Boy

I won’t go into too much detail here since technical literature about these systems can be found everywhere, but briefly, here are the important differences between the capabilities of these two systems:

Feature           VB                   GBA
CPU               20 MHz NEC           16.7 MHz ARM
RAM               64K                  128K
Resolution        384x224 (per eye)    240x160
Colors            4                    32768
Refresh Rate      50 Hz                60 Hz
BG Layers         Up to 32             Up to 4
Max Sprites       1024                 128
Max Sprite Size   8x8                  64x64

They actually trade blows more evenly than you’d expect from 2 devices released 6 years apart. GBA’s CPU has a lower clock rate, but many operations require fewer cycles. Worth noting as well is that (unlike the NEC V810 included in the VB) the GBA’s CPU doesn’t support integer division or floating-point instructions — but luckily, Nintendo didn’t use those in Wario Land.

The GBA’s resolution is much lower, so our view is going to be cropped. To deal with that, the game allows you to move the window independently of Wario.

The refresh rate is problematic since this game (like most console games of the time) ties the game loop directly to the refresh rate, so running it at 60 Hz makes it run too fast. Originally, I skipped the update on every 6th frame, but that was noticeably juddery. To smooth this out, the physics engine now halves the motion vectors every third frame, which looks much better but is more computationally expensive.
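The arithmetic behind that trick can be sketched as follows. This is a minimal illustration rather than the port's actual code: `scale_motion` and `distance_over` are hypothetical names, and velocities are assumed to be fixed-point integers. Over any three frames an object moves 1 + 1 + 0.5 = 2.5 velocity units instead of 3, which is the same 5/6 ratio as 50 Hz to 60 Hz.

```cpp
#include <cstdint>

// Hypothetical helper: distance to move on this 60 Hz frame, given the
// per-frame velocity the original 50 Hz game logic would have used.
inline int32_t scale_motion(int32_t velocity, uint32_t frame_counter) {
    // Every third frame moves at half speed: (1 + 1 + 0.5) / 3 = 5/6 = 50/60.
    return (frame_counter % 3 == 2) ? velocity / 2 : velocity;
}

// Total distance covered over a number of consecutive 60 Hz frames.
inline int32_t distance_over(int32_t velocity, uint32_t frames) {
    int32_t total = 0;
    for (uint32_t f = 0; f < frames; ++f) {
        total += scale_motion(velocity, f);
    }
    return total;
}
```

With a velocity of 256 (1.0 in 8.8 fixed point, an assumed format), 60 frames of the GBA loop cover the same distance as 50 frames of the original loop.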

The number of background layers was also problematic, since the GBA can only scroll 4 layers separately. The main area of the first level, for example, uses 7 background layers, including the HUD. Fortunately, there are only 2 speeds at which backgrounds scroll (multiple layers therefore scroll together). To implement this with the limited number of backgrounds supported on the GBA, I ended up “baking” multiple background maps into a single layer in software on-the-fly and displaying the result as a single background. Because there is no blending between tiles (either the foreground or background tile is used) there is some artifacting from this approach, but it’s almost unnoticeable (at least on the levels I experimented with).
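A minimal sketch of the baking step, assuming both maps have the same dimensions and that tile index 0 marks an empty foreground cell (both are assumptions for illustration; `bake_layers` is a hypothetical name). Each output cell takes either the foreground tile or the background tile, never a mix of the two, which is where the artifacts come from.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: tile index 0 means "empty" in the foreground map.
constexpr uint16_t kEmptyTile = 0;

// Combine two tilemaps that scroll at the same speed into one map.
// The foreground tile wins wherever it is non-empty; otherwise the
// background tile shows through. There is no per-pixel blending.
std::vector<uint16_t> bake_layers(const std::vector<uint16_t>& foreground,
                                  const std::vector<uint16_t>& background) {
    std::vector<uint16_t> out(background);
    const std::size_t n =
        foreground.size() < out.size() ? foreground.size() : out.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (foreground[i] != kEmptyTile) {
            out[i] = foreground[i];
        }
    }
    return out;
}
```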

Artifacting from combining 2 background maps


Finally, the GBA can display fewer, larger sprites, while the VB can display more sprites that are limited to a smaller size. Enemies and other dynamic objects are made up of many 8x8 sprites that move together, so at first it seems like we could just promote these to the larger sprites supported by the GBA. Unfortunately, these 8x8 sprites are not grid-aligned, making that impossible (see the section on sprites below).

Levels

Second section of the first stage


My first order of business was to see if I could figure out at what ROM offset levels are stored, and what format they are in. This ancient blog post (written by a still-active VB hacker) provided some clues, but most of the work was setting breakpoints in Mednafen that triggered whenever background map data was written to, then stepping through the disassembly to figure out where the data for the tilemap came from. It turns out that there are two sources for the tilemap data: ROM and RAM. But why?

Backgrounds that originate directly from ROM, it turns out, are background layers that are static. Wario and enemies can’t interact with these tiles and so they never change. Tilemaps originating in RAM are copied from an RLE-compressed format from ROM when the level loads. This is the layer that contains blocks that Wario can destroy or open, and there are also hidden tiles that specify spawn points for enemies and other dynamic objects. Because this layer can change (opened blocks must remain opened, off-screen enemies need to have their positions remembered), it makes sense that this needs to be in writable RAM rather than ROM, and RLE-compression is a good option for saving data with these layers, since they are mostly empty space through which Wario can move.
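To illustrate why run-length encoding suits a mostly empty layer, here is a minimal decoder sketch. The format shown, a stream of (run length, tile value) byte pairs, is purely hypothetical; the game's real encoding is not described here and likely differs. `rle_decode` is an illustrative name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical format: a stream of (run_length, tile_value) byte pairs.
// This is NOT the game's actual encoding, only an example of the idea.
std::vector<uint8_t> rle_decode(const std::vector<uint8_t>& src) {
    std::vector<uint8_t> out;
    for (std::size_t i = 0; i + 1 < src.size(); i += 2) {
        const std::size_t run_length = src[i];
        const uint8_t tile_value = src[i + 1];
        out.insert(out.end(), run_length, tile_value);  // append the run
    }
    return out;
}
```

In this toy format a long stretch of empty tiles costs two bytes per run of up to 255 cells, which is why a layer that is mostly open space compresses so well.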

Having no experience with classic game development, I found this implementation novel, but I’ve since learned that this is somewhat common practice in side-scrollers like this. In fact, Super Mario Brothers 3 for the NES has 8K of SRAM on the cartridge board specifically to hold mutable, uncompressed level data (I guess the 2K built into the NES wouldn’t cut it).

Something unique about this game is that many areas actually have 2 “dynamic” tilemaps that must be copied into RAM, since this game has a mechanic where Wario can traverse into the background layer and interact with it.

This game has depth


Unfortunately, there isn’t any “master level database” of any kind that would allow me to load any level generically. Instead, each level has an entry point in the form of a bespoke function that gets called whenever the level is entered, and this function does any one-off initialization, including decompressing the dynamic tilemaps, setting pointers to static tilemaps, and copying any sprite bitmap data into VRAM. Therefore, for every level I want to support, I must load into each section in Mednafen, peek at the memory to see what tilemaps are being loaded, and hardcode those values into the GBA port’s code.

Using the knowledge gathered above, I was able to throw together a small C++ program that, given a few ROM offsets, could display a section of level:

 

I then simply factored out the bits of the code used for decoding level data and used those modules in the GBA code.

Sprites

Not actually harmless


As mentioned before, sprites (called “objects”) on the VB are limited to 8x8 3-color (plus transparency) bitmaps. This is even more limiting than the NES or the Game Boy, both of which supported 8x16 sprites. However, the VB can draw up to 1024 of them per frame. The idea, therefore, is to compose larger virtual sprites by putting these 8x8 sprites adjacent to each other on screen. Given that, I was expecting to find something like this lying around in VRAM:

Grid-aligned spritesheet


However, what we actually get is this:

Actual spritesheet


Additionally, those bitmaps never change even though this character has many frames of animation!

Animations for this character


So how does this work?

It turns out that each piece of the character is being animated individually rather than the entire character being animated as a monolith. So, each arm is a logical sprite, each horn, each leg, etc. The reason for this is to give a sense of depth to the character by slightly adjusting the horizontal offsets of each part depending on which eye they are being viewed from, giving the illusion of the character being 3D when viewed using the Virtual Boy’s stereoscopic display.
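The per-eye offset can be sketched like this. The struct layout, the names, and the sign convention (left eye subtracts, right eye adds) are assumptions made for illustration, not the game's actual data format.

```cpp
#include <cstdint>

// One logical piece of a character (an arm, a horn, ...).
// Field names are illustrative only.
struct Part {
    int16_t x;         // horizontal position shared by both eyes
    int16_t y;         // vertical position, identical for both eyes
    int16_t parallax;  // per-eye horizontal shift; sign convention assumed
};

enum class Eye { Left, Right };

// Assumed convention: the left eye subtracts the parallax and the right
// eye adds it, so a larger parallax separates the two images further.
inline int16_t screen_x(const Part& part, Eye eye) {
    const int shift = (eye == Eye::Left) ? -part.parallax : part.parallax;
    return static_cast<int16_t>(part.x + shift);
}
```

Giving each piece its own parallax value is what lets one set of bitmaps serve both eyes: only the horizontal placement differs between the two views.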

Alternating between left and right viewpoints


This approach is more sophisticated than the naive one, where the entire sprite is just split into grid-aligned chunks, but it saves VRAM since you can animate limbs separately, and you can fake the parallax effect without having to load 2 copies of the character from different angles.

Unfortunately, figuring out how this works looked like it was going to be a pain… until I found that someone (going by the hacker alias “Racoon Sam”) had already tried to figure it out! I found this forum post that goes into excruciating detail on how they were able to hand-decode some sprites by reverse-engineering the format (actually, the format is the native VB sprite placement format but they didn’t realize that at the time). I reached out to this user and they explained how they were able to make some progress with decoding more sprites, and provided me with memory offsets to get me started on my own implementation.

This allowed me to create a tool that would scan the ROM for what looked like animated sprite data (by pattern matching bytes and scanning the entire ROM) and attempt to display it:

 

Once I had this up and running, I was able to fold that code into the GBA compatibility layer.

One caveat was that there isn’t a mapping between these sprite sheets and the character (aka bitmap) data that should be loaded into VRAM. I therefore had to display a sprite, then try different tile banks I had stumbled across until it looked recognizable, and store that mapping as part of the level initialization code (recall that level initialization is actually just an arbitrary function in the VB ROM). When a level (or actually, a section of a level) loads, a bunch of tile banks are swapped into character RAM to support the level tiles as well as any sprites.

Character RAM, color-coded by bank


Another important bit of info I learned was that some tile banks are compressed using RLE (using an algorithm that is for some reason incompatible with the aforementioned level compression RLE), while others use Huffman encoding, suggesting that Nintendo had to do some work to get this game to fit within the 2MB cartridge limit (only 1.4% of the available space appears to be unused).

Wario

Our protagonist, made of 32 overlapping 8x8 sprites


While the set of bitmap characters loaded for each section of a level is static across all terrain and NPCs, Wario himself is special-cased simply because of how high-resolution his sprite sheet is and how many animation frames there are.

More than 10% of available work RAM is reserved for storing Wario’s currently-loaded tile sets. It’s arranged as though this is VRAM character memory, suggesting that perhaps early on in development Nintendo expected to have enough VRAM to keep all of Wario’s tiles loaded but had to page them out later due to lack of space.

These banks can be loaded at any point, such as when Wario puts on a new hat or gets small. All of these tilesets are always RLE-encoded rather than Huffman — likely because that’s fast enough to do without skipping frames whenever Wario puts on a new hat.

The first 32 entries in character RAM are dedicated to Wario, and each frame, these are dynamically swapped in from the section of WRAM mentioned above to match whichever frame is being displayed. Similarly to how other characters are assembled, each part of Wario (arms, legs, feet, etc.) is animated independently and offset horizontally depending on which viewpoint (left or right) is being rendered to give that sense of depth.

Character 0 is extra special: Wario’s eye has its own independent set of animation frames that shows him looking around and blinking irrespective of what the rest of his body is doing.

Like with the other sprites, I had to write a tool to scan the ROM for Wario’s animations and match those up with the character banks.

Sound and Music

The last major piece of this puzzle was to get sound working. The VB has a 6-channel synthesizer that allows you to define your own wave forms, but these wave buffers are so short (just 32 samples) that you are going to effectively be tied to your standard set of wave shapes.

Conversely, the GBA supports 32 kHz PCM sound.

The most obvious approach to getting sound working would be to capture PCM samples of each sound effect, as well as the music, and then play those back during the game. However:

  • This would result in an extremely large amount of data
  • This violates the plan to ship without any Nintendo IP with the ROM hack

The only option, therefore, was to build an emulator for the VB’s sound chip (called the VSU, or Virtual Sound Unit) and then figure out how to read the sound data directly from the ROM.

Building a VSU emulator was fairly straightforward thanks to the Virtual Boy Sacred Tech Scroll. Making it performant enough to work on the GBA was a bit trickier, but not impossible.

Reverse engineering the sound format used in VBWL was the most time-consuming part of this process due to how sophisticated the sound system is.

Each instrument being played could be thought of as being a different song, and each such “song” is encoded as a program with a special instruction set that includes operations such as:

  • Update VSU register
  • Wait N frames before taking next action
  • Change pitch
  • Adjust pitch by N every frame
  • Adjust pitch according to table located at pointer X (used for vibrato)
  • Set start of section
  • Return to previously set start of section and replay that section N times
  • Adjust volume
  • Adjust envelope
  • Adjust left or right volume
  • Pass self to function located at given pointer to do something arbitrary to sound channel (a pain to implement)
  • Set priority
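To make the idea of a per-channel "song program" concrete, here is a toy interpreter covering four of the operations above (set pitch, wait, mark section, repeat section). The opcodes, their byte encoding, and every name in it are invented for illustration; the game's real instruction set is richer and encoded differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcodes, each followed by exactly one operand byte.
enum Op : uint8_t { kSetPitch, kWait, kMarkSection, kRepeatSection, kEnd };

struct Channel {
    std::vector<uint8_t> program;   // opcode/operand byte pairs
    std::size_t pc = 0;             // index of the next instruction
    std::size_t section_start = 0;  // set by kMarkSection
    int repeats_left = -1;          // -1 means no repeat is in progress
    uint8_t wait_frames = 0;        // frames to idle before continuing
    uint8_t pitch = 0;
    bool done = false;
};

// Advance one channel by one video frame.
void tick(Channel& ch) {
    if (ch.done) return;
    if (ch.wait_frames > 0) { --ch.wait_frames; return; }  // still idling
    while (ch.pc + 1 < ch.program.size()) {
        const uint8_t op = ch.program[ch.pc];
        const uint8_t arg = ch.program[ch.pc + 1];
        ch.pc += 2;
        switch (op) {
            case kSetPitch:
                ch.pitch = arg;
                break;
            case kWait:
                ch.wait_frames = arg;  // idle for the next `arg` frames
                return;
            case kMarkSection:
                ch.section_start = ch.pc;  // section begins after this op
                break;
            case kRepeatSection:
                if (ch.repeats_left < 0) ch.repeats_left = arg;  // first visit
                if (ch.repeats_left > 0) {
                    --ch.repeats_left;
                    ch.pc = ch.section_start;  // replay the section
                } else {
                    ch.repeats_left = -1;  // replays exhausted, carry on
                }
                break;
            default:  // kEnd or an unknown opcode stops the channel
                ch.done = true;
                return;
        }
    }
    ch.done = true;  // ran off the end of the program
}
```

Calling `tick` once per frame for each channel is enough to drive playback in this sketch; the real engine also has to handle envelopes, pitch tables, and the arbitrary function-pointer operation.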

All of these functions were reverse engineered by painfully stepping through Mednafen’s debugger and translating it to C. The code that runs still has several goto statements where I couldn’t otherwise find structured control flow mechanisms to get it to do what the V810 assembly does… it’s a bit painful to read.

These “song programs” are applied to a set of shadow VSU registers located in WRAM at address 0x5000100. This is because many times the current VSU register state needs to be read, but the VSU registers themselves are write-only. Whenever a value is changed in the shadow, there is a flag that is passed to write-through to the actual VSU hardware, which is only done if the priority of the change exceeds the value of whatever is currently playing on that channel.
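The shadow-register idea can be sketched as below. The register counts, the `hardware` array standing in for the real write-only VSU, and all names are assumptions for illustration. The comparison also accepts an equal priority, which is an assumption made so that a sound can keep updating the channel it already owns.

```cpp
#include <array>
#include <cstdint>

// Illustrative sizes; not taken from the game.
constexpr int kChannels = 6;
constexpr int kRegsPerChannel = 8;

using RegisterBank =
    std::array<std::array<uint8_t, kRegsPerChannel>, kChannels>;

struct ShadowVsu {
    RegisterBank shadow{};    // readable copy kept in work RAM
    RegisterBank hardware{};  // stand-in for the write-only VSU registers
    std::array<uint8_t, kChannels> active_priority{};  // what is playing now

    // The shadow is always updated; the value is only written through to
    // the hardware when the caller's priority is high enough (assumed >=).
    void write(int channel, int reg, uint8_t value, uint8_t priority) {
        shadow[channel][reg] = value;
        if (priority >= active_priority[channel]) {
            hardware[channel][reg] = value;
        }
    }

    // Reads come from the shadow because the real registers are write-only.
    uint8_t read(int channel, int reg) const { return shadow[channel][reg]; }
};
```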

All SFX and songs are conveniently stored contiguously in a few pointer tables, which made discovery of this data much easier than the graphics data. Each sound effect also has two versions: a low priority one and a high priority one. Depending on how important the sound was, it could interrupt other sounds that were currently playing.

In order to test my “Wario Land” song decoding separately from my VSU emulator, I ended up porting that part of the code to a VB ROM and playing the music back on the device until it sounded right.

Putting it all together

 

See the above video, along with the debug windows, for a technical view of the end result. Notice how the character memory changes when traversing between sections, and also notice the 4 rows of tiles that change every frame due to Wario’s tiles being swapped in.

   
