This repository contains the replication package of the research paper "Parallel Programming Models on Microcontrollers". The following sections illustrate where the patches to the OpenMP and SYCL runtimes can be found, and how to reproduce the experiments on a Raspberry Pi Pico RP2040.
The OpenMP runtime (libgomp for GCC and libomp for LLVM) is compiled together with the compiler, so installing the arm-miosix-eabi-gcc compiler also installs the corresponding runtime.
The compiler patches required to run OpenMP on microcontrollers are in the openmp-runtime-patches directory.
As for SYCL, the AdaptiveCpp runtime is separate from the compiler and lives in the AdaptiveCpp-Embedded git submodule. The modifications required to adapt AdaptiveCpp to microcontrollers are in its git commit history.
To install the necessary dependencies, you can run:
sudo apt install cmake ninja-build python3.12
or the equivalent command for non-Debian Linux distributions.
Optionally, also install ccache to cache the miosix-llvm build
sudo apt install ccache
Install picotool from https://github.com/raspberrypi/picotool
Install the openocd version patched for the Raspberry Pi from https://github.com/raspberrypi/openocd
Remember to install picotool's udev rules and to add your user to plugdev and dialout groups in order to be able to call picotool and screen /dev/ttyACM0 without sudo.
Also create the file /etc/udev/rules.d/60-openocd.rules containing
SUBSYSTEM=="usb", ATTRS{idVendor}=="2e8a", ATTRS{idProduct}=="000c", MODE="0666", TAG+="uaccess"
to call openocd without sudo. Reboot for the changes to take effect.
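The group membership mentioned above can be checked before attempting to flash. The following is a minimal sketch, not part of this repository; the helper names missing_groups and current_group_names are hypothetical. It reports the groups of the running process, so a newly added group only shows up after logging in again or rebooting.

```python
import grp
import os

# Groups the instructions above ask for (picotool and the serial port).
REQUIRED_GROUPS = ("plugdev", "dialout")


def missing_groups(current, required=REQUIRED_GROUPS):
    """Return the required groups that are absent from `current`, in order."""
    have = set(current)
    return [name for name in required if name not in have]


def current_group_names():
    """Names of the groups the running process belongs to (Unix only)."""
    names = []
    for gid in os.getgroups():
        try:
            names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            # A numeric group id with no entry in the group database.
            pass
    return names


if __name__ == "__main__":
    missing = missing_groups(current_group_names())
    if missing:
        print("not a member of: " + ", ".join(missing))
    else:
        print("group membership looks fine")
```

If a group is reported as missing, the standard fix on Debian-like systems is sudo usermod -aG followed by the group name and the user name, then a new login session.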
Open a shell in the root directory of this repository
Fetch the Miosix and AdaptiveCpp submodules with git submodule update --init --recursive
Access the compiler directory with cd miosix-kernel/miosix/_tools/compiler
Install the Miosix gcc compiler (either use the binary release provided as part of this repository, or build it from source using the installation scripts provided together with the kernel in the miosix-kernel/miosix/_tools/compiler directory):
sh MiosixToolchainInstaller9.2.0mp3.4.run
Compile and install the Miosix llvm compiler. The install-script is set to use ccache by default: if you didn't install it, set USE_CCACHE=0 inside the script, otherwise it will stop early.
cd llvm-18
./download.sh
./install-script.sh
Compile and install AdaptiveCpp:
cd ../../../../../AdaptiveCpp-Embedded
mkdir build
cd build
cmake -DCOMPILE_FOR_MICROCONTROLLERS=ON \
-DCMAKE_INSTALL_PREFIX=/opt/miosix-adaptivecpp \
-DCMAKE_TOOLCHAIN_FILE=../../miosix-kernel/miosix/cmake/Toolchains/clang.cmake \
-DCMAKE_C_FLAGS="-mcpu=cortex-m0plus -mthumb" \
-DCMAKE_CXX_FLAGS="-mcpu=cortex-m0plus -mthumb" ..
make -j`nproc`
sudo make install
Open a shell in the root directory of this repository and access the tests directory with cd tests.
Note that make <target>_program flashes the microcontroller. You can see the program output on the serial port to which the microcontroller is connected.
cd miosix
mkdir build
cd build
cmake ..
make -j`nproc`
make main_program
cd ../..
The output should be similar to this:
Starting Kernel... Ok
Miosix v3.0devel1 (rp2040_raspberry_pi_pico, Jun 5 2025 15:53:43)
Mounting MountpointFs as / ... Ok
Mounting DevFs as /dev ... Ok
Mounting Fat32Fs as /sd ... Failed
OS Timer freq = 48000000 Hz
Available heap 259344 out of 266888 Bytes
Hello world!
cd openmp
mkdir build
cd build
cmake ..
make -j`nproc`
make main_program
cd ../..
The output should be similar to this:
Starting Kernel... Ok
Miosix v3.0devel1 (rp2040_raspberry_pi_pico, Jun 5 2025 16:32:39)
Mounting MountpointFs as / ... Ok
Mounting DevFs as /dev ... Ok
Mounting Fat32Fs as /sd ... Failed
OS Timer freq = 48000000 Hz
Available heap 249096 out of 256728 Bytes
Hello from the main thread! Using 2 threads.
Iteration 0 executed by thread 0
Iteration 5 executed by thread 1
Iteration 1 executed by thread 0
Iteration 6 executed by thread 1
Iteration 2 executed by thread 0
Iteration 7 executed by thread 1
Iteration 3 executed by thread 0
Iteration 8 executed by thread 1
Iteration 4 executed by thread 0
Iteration 9 executed by thread 1
Total parallel time: 6.00904
Iteration 0 executed by thread 0
Iteration 1 executed by thread 0
Iteration 2 executed by thread 0
Iteration 3 executed by thread 0
Iteration 4 executed by thread 0
Iteration 5 executed by thread 0
Iteration 6 executed by thread 0
Iteration 7 executed by thread 0
Iteration 8 executed by thread 0
Iteration 9 executed by thread 0
Total sequential time: 10.04
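To compare the two timings in the output above, the speedup can be computed as the sequential time divided by the parallel time. The snippet below is a small illustrative sketch that parses the two "Total ... time" lines; the function names are hypothetical and the log excerpt is copied from the sample output, so your own numbers will differ.

```python
import re


def parse_times(log):
    """Map 'parallel' and 'sequential' to the times reported in the log."""
    pattern = re.compile(r"Total (parallel|sequential) time:\s*([0-9.]+)")
    return {kind: float(value) for kind, value in pattern.findall(log)}


def speedup(times):
    """Sequential time divided by parallel time (unit-free ratio)."""
    return times["sequential"] / times["parallel"]


# Excerpt of the sample output shown above.
log = """Total parallel time: 6.00904
Total sequential time: 10.04"""

times = parse_times(log)
s = speedup(times)
# The sample run reports 2 threads, so efficiency is speedup / 2.
print(f"speedup: {s:.2f}, efficiency on 2 threads: {s / 2:.2f}")
# prints "speedup: 1.67, efficiency on 2 threads: 0.84"
```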
cd sycl-library-only
mkdir build
cd build
cmake .. -DCMAKE_CXX_STANDARD=17
make -j`nproc`
make main_program
cd ../..
The output should be similar to this:
Starting Kernel... Ok
Miosix v3.0devel1 (rp2040_raspberry_pi_pico, Jun 5 2025 16:43:12)
Mounting MountpointFs as / ... Ok
Mounting DevFs as /dev ... Ok
Mounting Fat32Fs as /sd ... Failed
OS Timer freq = 48000000 Hz
Available heap 235848 out of 255920 Bytes
Using 2 cores
Device: AdaptiveCpp OpenMP host device
[AdaptiveCpp Warning] This application uses SYCL buffers; the SYCL buffer-accessor model is well-known to introduce unnecessary overheads. Please consider migrating to the SYCL2020 USM model, in particular device USM (sycl::malloc_device) combined with in-order queues for more performance. See the AdaptiveCpp performance guide for more information:
https://github.com/AdaptiveCpp/AdaptiveCpp/blob/develop/doc/performance.md
Total time: 0.838656
C:
200 200 200 200 200
200 200 200 200 200
200 200 200 200 200
200 200 200 200 200
200 200 200 200 200
matmul successfully completed
Stack memory statistics.
Size: 16384
Used (current/max): 1792/3080
Free (current/min): 14592/13304
Heap memory statistics.
Size: 255920
Used (current/max): 202096/202672
Free (current/min): 53824/53248
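The memory statistics printed at the end of the run can be sanity-checked: in the sample above, current used plus current free equals the size, and so does maximum used plus minimum free. The sketch below parses that block and verifies both sums. It assumes the line layout shown in the sample output, and its function names are hypothetical.

```python
import re


def parse_memory_stats(log):
    """Return {'Stack': {...}, 'Heap': {...}} with 'size', 'used' and 'free'."""
    stats = {}
    section = None
    for raw in log.splitlines():
        line = raw.strip()
        header = re.match(r"(Stack|Heap) memory statistics", line)
        if header:
            section = header.group(1)
            stats[section] = {}
            continue
        if section is None:
            continue
        size = re.match(r"Size:\s*(\d+)", line)
        pair = re.match(r"(Used|Free) \([^)]*\):\s*(\d+)/(\d+)", line)
        if size:
            stats[section]["size"] = int(size.group(1))
        elif pair:
            # Stored as (current, max) for 'used' and (current, min) for 'free'.
            key = pair.group(1).lower()
            stats[section][key] = (int(pair.group(2)), int(pair.group(3)))
    return stats


def is_consistent(entry):
    """Check current used + current free and max used + min free against size."""
    (used_now, used_max) = entry["used"]
    (free_now, free_min) = entry["free"]
    size = entry["size"]
    return used_now + free_now == size and used_max + free_min == size


# Excerpt of the sample output shown above.
log = """Stack memory statistics.
Size: 16384
Used (current/max): 1792/3080
Free (current/min): 14592/13304
Heap memory statistics.
Size: 255920
Used (current/max): 202096/202672
Free (current/min): 53824/53248"""

stats = parse_memory_stats(log)
for name, entry in stats.items():
    print(name, "consistent" if is_consistent(entry) else "inconsistent")
```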
Open a shell in the root directory of this repository and access the polybench directory with cd benchmarks/polybench
Connect the RP2040 to the host using a debug probe, or a second RP2040 acting as a debugger, as explained in https://datasheets.raspberrypi.com/pico/getting-started-with-pico.pdf
Open a separate shell in the current directory, start openocd and leave it open
openocd -f ../../miosix-kernel/miosix/arch/cortexM0plus_rp2040/rp2040_raspberry_pi_pico/openocd.cfg
Identify the device in your filesystem and set its path inside run.py, modifying the line
device = "/dev/ttyACM0"
Install python requirements with python3.12 -m pip install -r requirements.txt
Run the benchmarks with python3.12 run.py
Observe the results in the results directory
After this process, the additional binary bench-runner-rt will also be available in the following directories:
benchmarks/polybench/openmp/build_gcc
benchmarks/polybench/openmp/build_clang
benchmarks/polybench/sycl/build_gcc
benchmarks/polybench/sycl/build_clang
You can flash these additional binaries to run the benchmarks co-scheduled with two real-time threads and observe that no deadlines are missed.
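For the step that sets the device path in run.py, the serial device can be located with a short script instead of by hand. This is a hedged sketch that is independent of run.py, whose internals are not described here. It assumes the board enumerates as a /dev/ttyACM* device, as in the default shown above, and the helper pick_device is hypothetical.

```python
import glob


def pick_device(candidates, preferred="/dev/ttyACM0"):
    """Return `preferred` if present, else the first candidate in
    lexicographic order, else None."""
    ordered = sorted(candidates)
    if preferred in ordered:
        return preferred
    return ordered[0] if ordered else None


if __name__ == "__main__":
    # Assumption: the board shows up as a CDC ACM serial device.
    found = glob.glob("/dev/ttyACM*")
    device = pick_device(found)
    if device:
        print(f'device = "{device}"')
    else:
        print("no /dev/ttyACM* device found; is the board connected?")
```

When several boards are attached, for example the target and a second RP2040 used as a debugger, check which path belongs to the target before copying the printed line into run.py.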