SystemVerilog code for simulating a Game Boy system with Icarus Verilog. Most of the code is generated from the netlist files in msinger/dmg-schematics.
| File(s) | Description |
|---|---|
| ./dmg_cpu_b/cells/*.sv | Modules implementing all standard cells of the DMG-CPU B chip, including the RAM and ROM blocks. |
| ./dmg_cpu_b/dmg_cpu_b.sv | The DMG-CPU B chip. |
| ./sm83/cells/*.sv | Modules implementing all cells in the SM83 CPU core, including the huge decoders. |
| ./sm83/sm83.sv | The SM83 CPU core. |
| ./dmg_cpu_b_gameboy.sv | Top level module that simulates a complete Game Boy system with DMG-CPU B chip, WRAM, VRAM, LCD, audio and cartridge. |
| ./snd_dump.sv | Code for dumping the APU's sound output to a file. |
| ./vid_dump.sv | Code for dumping the PPU's video signals to a file. |
| ./mkvid/mkimgs.c | C code for extracting raw RGB frames from video signal dumps. |
| ./mkvid/mkvid.sh | Bash script for combining dumped sound output and extracted RGB frames to a video. |
| ./boot/quickboot.s | Assembly code for a boot ROM that boots in less than 0.2 seconds. Leaves the system in the same state as the original boot ROM. |
Of course, you need Icarus Verilog and GNU Make.
If you want to generate video files, you also need GCC, ImageMagick and FFmpeg.
The original boot ROM is not part of this repository. To simulate the original boot ROM, copy a boot ROM image
into the root of this repository under the name DMG_ROM.bin. Alternatively, place it anywhere else and
override the make variable BOOTROM, for example BOOTROM=/path/to/bootrom.
The default make target (sim-gameboy) simulates a complete Game Boy system (dmg_cpu_b_gameboy.sv). Run
make sim-gameboy
or just
make
to start the simulation. This simulates a Game Boy without cartridge, executing the boot ROM in ./DMG_ROM.bin.
The simulation can produce any of the following files:
| File(s) | Description |
|---|---|
| ./dmg_cpu_b_gameboy.snd | The APU's sound output. All four channels mixed into one 16 bit PCM file with 65536 Hz stereo. |
| ./dmg_cpu_b_gameboy_ch[1-4].snd | One 8 bit PCM file with 65536 Hz mono for each channel. Only produced if CH_DUMP=y is set on the make command line. |
| ./dmg_cpu_b_gameboy.vid | The PPU's video signal dump. Can be used to extract images for a video. |
| ./dmg_cpu_b_gameboy.fst | Dump of all signals in the system. Can be opened with GTKWave or any other wave viewer. |
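
Because the mixed sound dump has a fixed data rate, its size gives a quick check of how much simulated time it covers. The following is a minimal sketch, assuming the .snd file is headerless raw PCM (the table above does not state this explicitly). The helper name `snd_seconds` is hypothetical and not part of the repository.

```shell
# Hypothetical helper, not part of the repository.
# Assumption: headerless raw PCM, i.e.
# 65536 frames/s * 2 channels * 2 bytes per sample = 262144 bytes per second.
snd_seconds() {
    bytes=$(wc -c < "$1" | tr -d '[:space:]')
    awk -v n="$bytes" 'BEGIN { printf "%.3f\n", n / (65536 * 2 * 2) }'
}

# Usage: snd_seconds dmg_cpu_b_gameboy.snd
```

If the printed duration is much shorter than the requested SECS value, the simulation probably terminated early.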
To produce a playable video file from those dumps, run
make dmg_cpu_b_gameboy.mkv
BOOTROM=<path-to-binary>:
Specifies the boot ROM binary that is loaded into the boot ROM memory on startup. The path defaults to DMG_ROM.bin.
The file needs to have a size of 256 bytes. If it is shorter, the memory will have unknown ('x) values in simulation.
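
Since a short boot ROM image silently leaves 'x values in memory, it can be worth checking the file size before starting a long run. A minimal sketch follows. The function name `check_bootrom` is hypothetical and the check is not part of the repository's Makefile.

```shell
# Hypothetical helper, not part of the repository.
# Verifies that a boot ROM image exists and is exactly 256 bytes long.
check_bootrom() {
    file=${1:-DMG_ROM.bin}
    if [ ! -f "$file" ]; then
        echo "boot ROM not found: $file" >&2
        return 1
    fi
    size=$(wc -c < "$file" | tr -d '[:space:]')
    if [ "$size" -ne 256 ]; then
        echo "boot ROM $file is $size bytes, expected 256" >&2
        return 1
    fi
    echo "ok: $file is 256 bytes"
}

# Usage: check_bootrom boot/quickboot.bin && make sim-gameboy BOOTROM=boot/quickboot.bin
```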
ROM=<path-to-binary>:
Specifies a Game Boy ROM file that will be used as a cartridge. By default, the system will be simulated without any
cartridge inserted. As of now, the simulation supports no-MBC, MBC1 and MBC5 cartridges.
SECS=<number-of-seconds>:
Specifies how many seconds will be simulated until the simulation terminates. The default is 6.0 seconds.
The simulation takes about 132 minutes of real time per simulated second on a Ryzen 5 3600 (with TIMING=default
and SIMPLIFY_OAM=y).
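
To get a feeling for what a given SECS value costs, the figure above can be turned into a rough wall-clock estimate. This is only a sketch: the 132 minutes per simulated second apply to the configuration and CPU named above, other settings (for example DUMP= or TIMING=nodelay) will differ, and the function name `estimate_hours` is hypothetical.

```shell
# Hypothetical helper, not part of the repository.
# Rough estimate: 132 minutes of real time per simulated second,
# converted to hours.
estimate_hours() {
    awk -v secs="$1" 'BEGIN { printf "%.1f\n", secs * 132 / 60 }'
}

# estimate_hours 6     prints 13.2   (the default SECS)
# estimate_hours 120   prints 264.0  (eleven days)
```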
DUMP=fst (default), DUMP=vcd, DUMP=:
By default, all internal signals are dumped in FST format. To dump in VCD format instead, add DUMP=vcd to the make
command line. An empty DUMP= disables dumping of internal signals. The simulation runs faster with signal dumps
disabled, which is useful when you only want to generate a video.
TIMING=default (default), TIMING=nodelay:
Selects which timing model to use. TIMING=default simulates the system with (hopefully) realistic signal propagation
delays. Delays are calculated based on individual transistor sizes and wire lengths. TIMING=nodelay simulates the system
with 0 delays instead. Without delays, the simulation may run faster, but the exact behavior of the real device depends
on glitches that emerge from "bad timing". You need to run make clean when changing this variable, because delays in
Verilog are compile time constants, meaning the code needs to be recompiled.
SIMPLIFY_OAM=y (default), SIMPLIFY_OAM=:
The default (SIMPLIFY_OAM=y) greatly increases simulation speed. It simulates the SRAM blocks of the OAM with reduced
complexity. However, with this simplification the infamous OAM bug is not reproduced. With SIMPLIFY_OAM=, the simplification
is disabled and the simulation can predict which RAM rows will be corrupted by the OAM bug. You need to run make clean
when changing this variable, because the code needs to be recompiled.
It takes a lot of time to simulate the Game Boy, so if you want to minimize the time that the simulation spends
running the boot ROM, you can use quickboot.bin in the boot folder. This ROM takes less than 200 milliseconds of
simulation time before it enters cartridge code. It recreates the same system state that the original DMG-CPU A/B/C
boot ROM creates, with one exception: the VRAM gets zeroed instead of being filled with the Nintendo logo. The DIV
register and the PPU get precisely synchronized with the boot ROM exit, so test ROMs that check for this will pass
just fine. You can rebuild the binary from source using the SM83 Binutils.
To use this boot ROM, you can either make it the default by moving it to the root of the repository and renaming it
to DMG_ROM.bin, or add the variable BOOTROM=boot/quickboot.bin to the make command line.
Simulate 120 seconds of Zelda with the quickboot ROM:
make sim-gameboy BOOTROM=boot/quickboot.bin \
ROM=~/GB\ roms/Legend\ of\ Zelda\,\ The\ -\ Link\'s\ Awakening\ DX.gbc \
SECS=120.0 \
DUMP=
Wait a few days (seriously!), and then run
make dmg_cpu_b_gameboy.mkv
to generate a video file of the Zelda intro.
Results of Blargg's tests:
| Test | Result (no delay) | Result (with delay) |
|---|---|---|
| cgb_sound | n/a | n/a |
| cpu_instrs | ✅ | ✅ |
| dmg_sound | ❌ | ❌ |
| halt_bug | ✅ | ✅ |
| instr_timing | ✅ | ✅ |
| interrupt_time | n/a | n/a |
| mem_timing | ✅ | ✅ |
| oam_bug | 🚫* | 🚫* |
* oam_bug tests depend on small differences in signal delays, which could change with temperature. According to SameBoy, some edge cases indeed behave nondeterministically on real hardware, so this will most likely never behave "correctly" in a purely digital simulation.
Note: The Blargg tests suffixed with -2 are not listed here, because they run exactly the same test code as the versions
without the suffix, but with much slower boilerplate code stitched around them, so they just waste a lot of CPU time.
(Running all the test ROMs through the simulation already takes over ten days! 🐌🐌🐌)
Results of Mooneye GB tests:
| Test | Result (no delay) | Result (with delay) |
|---|---|---|
| acceptance/add_sp_e_timing | ✅ | ✅ |
| acceptance/bits/mem_oam | ✅ | ✅ |
| acceptance/bits/reg_f | ✅ | ✅ |
| acceptance/bits/unused_hwio-GS | ✅ | ✅ |
| acceptance/boot_div-dmg0 | n/a | n/a |
| acceptance/boot_div-dmgABCmgb | ✅ | ✅ |
| acceptance/boot_div-S | n/a | n/a |
| acceptance/boot_div2-S | n/a | n/a |
| acceptance/boot_hwio-dmg0 | n/a | n/a |
| acceptance/boot_hwio-dmgABCmgb | ✅ | ✅ |
| acceptance/boot_hwio-S | n/a | n/a |
| acceptance/boot_regs-dmg0 | n/a | n/a |
| acceptance/boot_regs-dmgABC | ✅ | ✅ |
| acceptance/boot_regs-mgb | n/a | n/a |
| acceptance/boot_regs-sgb | n/a | n/a |
| acceptance/boot_regs-sgb2 | n/a | n/a |
| acceptance/call_cc_timing | ✅ | ✅ |
| acceptance/call_cc_timing2 | ✅ | ✅ |
| acceptance/call_timing | ✅ | ✅ |
| acceptance/call_timing2 | ✅ | ✅ |
| acceptance/di_timing-GS | ✅ | ✅ |
| acceptance/div_timing | ✅ | ✅ |
| acceptance/ei_sequence | ✅ | ✅ |
| acceptance/ei_timing | ✅ | ✅ |
| acceptance/halt_ime0_ei | ✅ | ✅ |
| acceptance/halt_ime0_nointr_timing | ✅ | ✅ |
| acceptance/halt_ime1_timing | ✅ | ✅ |
| acceptance/halt_ime1_timing2-GS | ✅ | ✅ |
| acceptance/if_ie_registers | ✅ | ✅ |
| acceptance/instr/daa | ✅ | ✅ |
| acceptance/interrupts/ie_push | ✅ | ✅ |
| acceptance/intr_timing | ✅ | ✅ |
| acceptance/jp_cc_timing | ✅ | ✅ |
| acceptance/jp_timing | ✅ | ✅ |
| acceptance/ld_hl_sp_e_timing | ✅ | ✅ |
| acceptance/oam_dma/basic | ✅ | ✅ |
| acceptance/oam_dma/reg_read | ✅ | ✅ |
| acceptance/oam_dma/sources-dmgABCmgbS | ✅ | ✅ |
| acceptance/oam_dma_restart | ✅ | ✅ |
| acceptance/oam_dma_start | ✅ | ✅ |
| acceptance/oam_dma_timing | ✅ | ✅ |
| acceptance/pop_timing | ✅ | ✅ |
| acceptance/ppu/hblank_ly_scx_timing-GS | ✅ | ✅ |
| acceptance/ppu/intr_1_2_timing-GS | ✅ | ✅ |
| acceptance/ppu/intr_2_0_timing | ✅ | ✅ |
| acceptance/ppu/intr_2_mode0_timing | ✅ | ✅ |
| acceptance/ppu/intr_2_mode0_timing_sprites | ✅ | ✅ |
| acceptance/ppu/intr_2_mode3_timing | ✅ | ✅ |
| acceptance/ppu/intr_2_oam_ok_timing | ✅ | ✅ |
| acceptance/ppu/lcdon_timing-dmgABCmgbS | ✅ | ✅ |
| acceptance/ppu/lcdon_write_timing-GS | ❌ | ✅ |
| acceptance/ppu/stat_irq_blocking | ✅ | ✅ |
| acceptance/ppu/stat_lyc_onoff | ✅ | ✅ |
| acceptance/ppu/vblank_stat_intr-GS | ✅ | ✅ |
| acceptance/push_timing | ✅ | ✅ |
| acceptance/rapid_di_ei | ✅ | ✅ |
| acceptance/ret_cc_timing | ✅ | ✅ |
| acceptance/ret_timing | ✅ | ✅ |
| acceptance/reti_intr_timing | ✅ | ✅ |
| acceptance/reti_timing | ✅ | ✅ |
| acceptance/rst_timing | ✅ | ✅ |
| acceptance/serial/boot_sclk_align-dmgABCmgb | ✅ | ✅ |
| acceptance/timer/div_write | ✅ | ✅ |
| acceptance/timer/rapid_toggle | ✅ | ✅ |
| acceptance/timer/tim00 | ✅ | ✅ |
| acceptance/timer/tim00_div_trigger | ✅ | ✅ |
| acceptance/timer/tim01 | ✅ | ✅ |
| acceptance/timer/tim01_div_trigger | ✅ | ✅ |
| acceptance/timer/tim10 | ✅ | ✅ |
| acceptance/timer/tim10_div_trigger | ✅ | ✅ |
| acceptance/timer/tim11 | ✅ | ✅ |
| acceptance/timer/tim11_div_trigger | ✅ | ✅ |
| acceptance/timer/tima_reload | ✅ | ✅ |
| acceptance/timer/tima_write_reloading | ✅ | ✅ |
| acceptance/timer/tma_write_reloading | ✅ | ✅ |
| madness/mgb_oam_dma_halt_sprites | 🚫* | 🚫* |
| manual-only/sprite_priority | ✅ | ✅ |
* The madness/mgb_oam_dma_halt_sprites test behaves nondeterministically on real hardware: it shows a different picture on individual DMG-CPU B devices, so there is no "correct" result we could test for. But I think it will be worth looking into how this works anyway.