msinger/dmg-sim

DMG-CPU B Game Boy Simulation

SystemVerilog code for simulating a Game Boy system with Icarus Verilog. Most of the code is generated from the netlist files in msinger/dmg-schematics.

Files in this repo

| File(s) | Description |
| --- | --- |
| ./dmg_cpu_b/cells/*.sv | Modules implementing all standard cells of the DMG-CPU B chip, including the RAM and ROM blocks. |
| ./dmg_cpu_b/dmg_cpu_b.sv | The DMG-CPU B chip. |
| ./sm83/cells/*.sv | Modules implementing all cells in the SM83 CPU core, including the huge decoders. |
| ./sm83/sm83.sv | The SM83 CPU core. |
| ./dmg_cpu_b_gameboy.sv | Top level module that simulates a complete Game Boy system with DMG-CPU B chip, WRAM, VRAM, LCD, audio and cartridge. |
| ./snd_dump.sv | Code for dumping the APU's sound output to a file. |
| ./vid_dump.sv | Code for dumping the PPU's video signals to a file. |
| ./mkvid/mkimgs.c | C code for extracting raw RGB frames from video signal dumps. |
| ./mkvid/mkvid.sh | Bash script for combining dumped sound output and extracted RGB frames to a video. |
| ./boot/quickboot.s | Assembly code for a boot ROM that boots in less than 0.2 seconds. Leaves the system in the same state as the original boot ROM. |

Requirements

Of course, you need Icarus Verilog and GNU Make.

If you want to generate video files, you also need GCC, ImageMagick and FFmpeg.

The original boot ROM is not part of this repository. If you want to simulate the original boot ROM, copy a boot ROM image into the root of this repository with the name DMG_ROM.bin. Alternatively, place it anywhere else and override the make variable BOOTROM, for example BOOTROM=/path/to/bootrom.

Usage

The default make target (sim-gameboy) simulates a complete Game Boy system (dmg_cpu_b_gameboy.sv). Run

make sim-gameboy

or just

make

to start the simulation. This simulates a Game Boy without a cartridge, executing the boot ROM in ./DMG_ROM.bin.

The simulation can produce any of the following files:

| File(s) | Description |
| --- | --- |
| ./dmg_cpu_b_gameboy.snd | The APU's sound output. All four channels mixed into one 16 bit PCM file with 65536 Hz stereo. |
| ./dmg_cpu_b_gameboy_ch[1-4].snd | One 8 bit PCM file with 65536 Hz mono for each channel. Only if CH_DUMP=y is set on make command line. |
| ./dmg_cpu_b_gameboy.vid | The PPU's video signal dump. Can be used to extract images for a video. |
| ./dmg_cpu_b_gameboy.fst | Dump of all signals in the system. Can be opened with GTKWave or any other wave viewer. |
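
The format of the mixed sound dump implies a fixed data rate, which gives a quick way to check how much simulated time a dump covers. The following is a minimal sketch, assuming the .snd file is raw PCM without any header (the README does not say so explicitly). The helper name snd_seconds is made up for this example.

```shell
# Bytes per simulated second in the mixed dump:
# 65536 samples/s * 2 channels * 2 bytes per 16 bit sample.
SND_BYTES_PER_SEC=$((65536 * 2 * 2))

# snd_seconds FILE: print how many whole simulated seconds FILE covers.
# Assumption: FILE is raw PCM with no header.
snd_seconds() {
    size=$(wc -c < "$1") || return 1
    echo $((size / SND_BYTES_PER_SEC))
}
```

For example, snd_seconds dmg_cpu_b_gameboy.snd prints the number of complete seconds dumped so far. For the per-channel files (8 bit mono), the rate would be 65536 bytes per second instead.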

To produce a playable video file from those dumps, run

make dmg_cpu_b_gameboy.mkv

Make variables for configuration

BOOTROM=<path-to-binary>:
Specifies the boot ROM binary that is loaded into the boot ROM memory on startup. The path defaults to DMG_ROM.bin. The file needs to have a size of 256 bytes. If it is shorter, the memory will have unknown ('x) values in simulation.
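
Since a boot ROM image of the wrong size leads to unknown values in simulation, it can be worth checking the file before starting a long run. A minimal sketch, where the helper name check_bootrom is hypothetical and not part of this repository:

```shell
# check_bootrom FILE: succeed only if FILE exists and is exactly 256 bytes.
check_bootrom() {
    [ -f "$1" ] || { echo "boot ROM not found: $1" >&2; return 1; }
    # The arithmetic expansion strips whitespace that some wc versions emit.
    size=$(($(wc -c < "$1")))
    if [ "$size" -ne 256 ]; then
        echo "boot ROM is $size bytes, expected 256: $1" >&2
        return 1
    fi
}
```

It could be used as check_bootrom DMG_ROM.bin && make so that the simulation only starts when the image has the expected size.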

ROM=<path-to-binary>:
Specifies a Game Boy ROM file that will be used as a cartridge. By default, the system will be simulated without any cartridge inserted. As of now, the simulation supports no-MBC, MBC1 and MBC5 cartridges.
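
Before spending days on a run, you may want to check which mapper a ROM file asks for. The sketch below reads the cartridge type byte at offset 0x147 of the standard cartridge header and maps it to the mapper families named above. The helper name cart_mbc is hypothetical. It is an assumption that every RAM, battery and rumble variant within a family is handled by the simulation; the README only names the families.

```shell
# cart_mbc FILE: print the mapper family named by the cartridge header.
# The cartridge type byte sits at offset 0x147 (327 decimal) in the ROM.
cart_mbc() {
    code=$(od -An -tu1 -j 327 -N 1 "$1" 2>/dev/null | tr -d '[:space:]')
    case "$code" in
        0)                 echo "no-MBC" ;;  # 0x00: ROM only
        1|2|3)             echo "MBC1" ;;    # 0x01-0x03
        25|26|27|28|29|30) echo "MBC5" ;;    # 0x19-0x1E
        *)                 echo "other" ;;   # anything else, or unreadable
    esac
}
```

A result of other means the header names a mapper that is not in the list of supported cartridge types.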

SECS=<number-of-seconds>:
Specifies how many seconds will be simulated until the simulation terminates. The default is 6.0 seconds. The simulation takes about 132 minutes of real time per simulated second on a Ryzen 5 3600 (with TIMING=default and SIMPLIFY_OAM=y).
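
To get a feeling for what a given SECS value means in practice, the quoted figure can be turned into a rough wall-clock estimate. This is only a sketch: the helper name estimate_hours is made up, and the real duration depends on your machine and on the other make variables.

```shell
# estimate_hours SECS [MIN_PER_SEC]: print estimated wall-clock hours.
# MIN_PER_SEC defaults to 132, the figure quoted for a Ryzen 5 3600.
estimate_hours() {
    awk -v s="$1" -v m="${2:-132}" 'BEGIN { printf "%.1f\n", s * m / 60 }'
}
```

With the default of 6.0 seconds this comes out at about 13 hours, and 120 simulated seconds would take about 264 hours (11 days) at that rate. Pass your own measured minutes per simulated second as the second argument to adapt the estimate.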

DUMP=fst (default), DUMP=vcd, DUMP=:
By default, all internal signals are dumped in FST format. To dump in VCD, add DUMP=vcd to the make command line. DUMP= disables dumping of internal signals. When signal dumps are disabled, the simulation runs faster, which is useful when you only want to generate a video.

TIMING=default (default), TIMING=nodelay:
Selects which timing model to use. TIMING=default simulates the system with (hopefully) realistic signal propagation delays. Delays are calculated based on individual transistor sizes and wire lengths. TIMING=nodelay simulates the system with 0 delays instead. Without delays, the simulation may run faster, but it can deviate from the real device, whose exact behavior depends on glitches that emerge from "bad timing". You need to run make clean when changing this variable, because delays in Verilog are compile time constants, meaning the code needs to be recompiled.

SIMPLIFY_OAM=y (default), SIMPLIFY_OAM=:
The default (SIMPLIFY_OAM=y) greatly increases simulation speed. It simulates the SRAM blocks of the OAM with reduced complexity. With this simplification, however, the infamous OAM bug is not reproduced. With SIMPLIFY_OAM=, the simplification is disabled and the simulation can predict which RAM rows will be corrupted by the OAM bug. You need to run make clean when changing this variable, because the code needs to be recompiled.
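
Because both TIMING and SIMPLIFY_OAM require a make clean when their value changes, it is easy to forget the clean step. One possible way around this is a small wrapper that remembers the last configuration. The sketch below is not part of the repository; the stamp file name .sim-config and the helper name config_changed are made up for this example.

```shell
# Hypothetical stamp file; not part of the repository.
CONFIG_STAMP=${CONFIG_STAMP:-.sim-config}

# config_changed TIMING SIMPLIFY_OAM: succeed if the pair differs from the
# one recorded by the previous call, then record the new pair.
config_changed() {
    new="TIMING=$1 SIMPLIFY_OAM=$2"
    old=$(cat "$CONFIG_STAMP" 2>/dev/null || true)
    printf '%s\n' "$new" > "$CONFIG_STAMP"
    [ "$new" != "$old" ]
}
```

A run could then look like config_changed nodelay y && make clean, followed by make TIMING=nodelay SIMPLIFY_OAM=y. The first call always reports a change, which costs one extra make clean but is harmless.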

"Quickboot" boot ROM

It takes a lot of time to simulate the Game Boy, so if you want to minimize the time that the simulation spends running the boot ROM, you can use quickboot.bin in the boot folder. This ROM takes less than 200 milliseconds of simulation time before it enters cartridge code. It recreates the same system state that the original DMG-CPU A/B/C boot ROM creates, with one exception: the VRAM is zeroed instead of being filled with the Nintendo logo. The DIV register and the PPU are precisely synchronized with the boot ROM exit, so test ROMs that check for this will pass. You can rebuild the binary from source using the SM83 Binutils.

To use this boot ROM, you can either make it the default by moving it to the root of the repository and renaming it to DMG_ROM.bin, or add the variable BOOTROM=boot/quickboot.bin to the make command line.

Example usage

Simulate 120 seconds of Zelda with quickboot ROM:

make sim-gameboy BOOTROM=boot/quickboot.bin \
                 ROM=~/GB\ roms/Legend\ of\ Zelda\,\ The\ -\ Link\'s\ Awakening\ DX.gbc \
                 SECS=120.0 \
                 DUMP=

Wait a few days (seriously!), and then run

make dmg_cpu_b_gameboy.mkv

to generate a video file of the Zelda intro.

Tests

Results of Blargg's tests:

Test Result (no delay) Result (with delay)
cgb_sound n/a n/a
cpu_instrs
dmg_sound
halt_bug
instr_timing
interrupt_time n/a n/a
mem_timing
oam_bug 🚫* 🚫*

* oam_bug tests depend on small differences in signal delays which could change with temperature. According to SameBoy, some edge cases indeed behave nondeterministically on real hardware, so this will most likely never behave "correctly" in a purely digital simulation.

Note: The Blargg tests suffixed with -2 are not listed here, because they run exactly the same test code as the versions without the suffix, but with much slower boilerplate code stitched around them, so they just waste a lot of CPU time. (Running all the test ROMs through the simulation already takes over ten days! 🐌🐌🐌)

Results of Mooneye GB tests:

Test Result (no delay) Result (with delay)
acceptance/add_sp_e_timing
acceptance/bits/mem_oam
acceptance/bits/reg_f
acceptance/bits/unused_hwio-GS
acceptance/boot_div-dmg0 n/a n/a
acceptance/boot_div-dmgABCmgb
acceptance/boot_div-S n/a n/a
acceptance/boot_div2-S n/a n/a
acceptance/boot_hwio-dmg0 n/a n/a
acceptance/boot_hwio-dmgABCmgb
acceptance/boot_hwio-S n/a n/a
acceptance/boot_regs-dmg0 n/a n/a
acceptance/boot_regs-dmgABC
acceptance/boot_regs-mgb n/a n/a
acceptance/boot_regs-sgb n/a n/a
acceptance/boot_regs-sgb2 n/a n/a
acceptance/call_cc_timing
acceptance/call_cc_timing2
acceptance/call_timing
acceptance/call_timing2
acceptance/di_timing-GS
acceptance/div_timing
acceptance/ei_sequence
acceptance/ei_timing
acceptance/halt_ime0_ei
acceptance/halt_ime0_nointr_timing
acceptance/halt_ime1_timing
acceptance/halt_ime1_timing2-GS
acceptance/if_ie_registers
acceptance/instr/daa
acceptance/interrupts/ie_push
acceptance/intr_timing
acceptance/jp_cc_timing
acceptance/jp_timing
acceptance/ld_hl_sp_e_timing
acceptance/oam_dma/basic
acceptance/oam_dma/reg_read
acceptance/oam_dma/sources-dmgABCmgbS
acceptance/oam_dma_restart
acceptance/oam_dma_start
acceptance/oam_dma_timing
acceptance/pop_timing
acceptance/ppu/hblank_ly_scx_timing-GS
acceptance/ppu/intr_1_2_timing-GS
acceptance/ppu/intr_2_0_timing
acceptance/ppu/intr_2_mode0_timing
acceptance/ppu/intr_2_mode0_timing_sprites
acceptance/ppu/intr_2_mode3_timing
acceptance/ppu/intr_2_oam_ok_timing
acceptance/ppu/lcdon_timing-dmgABCmgbS
acceptance/ppu/lcdon_write_timing-GS
acceptance/ppu/stat_irq_blocking
acceptance/ppu/stat_lyc_onoff
acceptance/ppu/vblank_stat_intr-GS
acceptance/push_timing
acceptance/rapid_di_ei
acceptance/ret_cc_timing
acceptance/ret_timing
acceptance/reti_intr_timing
acceptance/reti_timing
acceptance/rst_timing
acceptance/serial/boot_sclk_align-dmgABCmgb
acceptance/timer/div_write
acceptance/timer/rapid_toggle
acceptance/timer/tim00
acceptance/timer/tim00_div_trigger
acceptance/timer/tim01
acceptance/timer/tim01_div_trigger
acceptance/timer/tim10
acceptance/timer/tim10_div_trigger
acceptance/timer/tim11
acceptance/timer/tim11_div_trigger
acceptance/timer/tima_reload
acceptance/timer/tima_write_reloading
acceptance/timer/tma_write_reloading
madness/mgb_oam_dma_halt_sprites 🚫* 🚫*
manual-only/sprite_priority

* The madness/mgb_oam_dma_halt_sprites test behaves nondeterministically on real hardware. It shows a different picture on individual DMG-CPU B devices, so there is no "correct" result we could test for. But I think it will be worth looking into how this works anyway.