diff --git a/BANANAPI.md b/BANANAPI.md
new file mode 100644
index 0000000..3a76d28
--- /dev/null
+++ b/BANANAPI.md
@@ -0,0 +1,129 @@
+# Setup BananaPI for benchmarking
+
+The goal of this documentation is to get a Linux image running on a
+Banana Pi board that allows for very isolated benchmarks showing full
+time distributions of the measurement runs.
+
+## Base Setup
+
+The first step is to get a Linux image running on the Banana Pi. Armbian worked very well for us;
+follow these steps to get it up and running:
+
+- Download the mainline base kernel image from Armbian: https://www.armbian.com/bananapi-m3/#kernels-archive-all
+- Unpack the download
+- Prepare a micro SD card for flashing
+- Use Etcher (https://www.balena.io/etcher/) or a similar tool to write the image to the SD card
+- Insert the micro SD card into the Banana Pi and power it on
+- Follow the instructions to set up a user account (best done with HDMI out and a keyboard attached)
+- Test the network connection/SSH login (in our experience Armbian just works, that's why we chose it)
+
+## Tweaking Scheduler, CPU and Interrupts
+
+We want as little dispersion from system jitter as possible. We recommend tweaking the
+scheduler, CPU and interrupt settings before running benchmarks.
+
+See the sub-sections below for the individual measures. ***Before running tests make sure to
+run the following scripts:***
+- `sudo ./setup_cpu.sh`
+- `sudo ./map_interrupts_core_0.sh`
+- `sudo ./setup_rt.sh`
+
+Then start your tests manually mapped to cores 1 to 7.
+
+### Pin all other processes to core 0
+
+To further reduce interference with our controlled benchmark environment we map all unrelated
+processes to core 0 of the system, running our benchmarks on cores 1 to 7.
+
+The system uses systemd, which makes this the simplest point to change the default process affinity.
+Edit the file `/etc/systemd/system.conf` and set `CPUAffinity=0`. This will make all processes forked
+from systemd run on core 0.
+Benchmarks can then be manually mapped to different cores.
+
+***BEFORE TESTS***: to make the config apply ***restart your system***
+
+### CPU frequency
+
+Limiting the frequency to 1GHz makes sure that the Banana Pi does not throttle during the tests.
+Additionally, disabling any dynamic frequency scaling makes tests more reproducible.
+
+Create a file called `setup_cpu.sh` and make it executable with `chmod +x setup_cpu.sh`:
+```shell script
+echo "Writing frequency utils settings file..."
+echo "ENABLE=true
+MIN_SPEED=1008000
+MAX_SPEED=1008000
+GOVERNOR=performance" > /etc/default/cpufrequtils
+
+echo "Restarting frequency utils service..."
+systemctl restart cpufrequtils  # assumes the cpufrequtils service (Debian/Armbian)
+echo "Done!"
+echo "Try ./watch_cpu.sh to see if everything worked."
+echo "Test your cooling by stressing the cpu and watching the temperature output."
+```
+
+Create a file called `watch_cpu.sh` and make it executable with `chmod +x watch_cpu.sh`:
+```shell script
+echo "Max Frequencies"
+cat /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_max_freq
+
+echo "Actual Frequencies"
+cat /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_cur_freq
+
+echo "Temps.."
+cat /sys/class/thermal/thermal_zone*/temp
+```
+
+***BEFORE TESTS***: to set up the CPU, run ***`sudo ./setup_cpu.sh`*** before your tests. To check that the
+change worked and that the temperatures stay stable, use the `./watch_cpu.sh` script.
+
+### Map interrupts to core 0
+
+Interrupts can interfere with our benchmarks. We therefore map them to core 0 if possible and run our tests on
+cores 1 to 7.
+
+Create a file called `map_interrupts_core_0.sh` and make it executable with `chmod +x map_interrupts_core_0.sh`:
+```shell script
+#!/bin/bash
+
+echo "Try to map interrupts to core 0."
+echo "Some might fail because they can not be mapped (e.g. core specific timers)."
+echo ""
+echo ""
+
+for dir in /proc/irq/*/
+do
+    echo "Mapping $dir ..."
+    echo 1 > "$dir/smp_affinity"
+done
+```
+
+***BEFORE TESTS***: map the interrupts to core 0 using ***`sudo ./map_interrupts_core_0.sh`***
+
+### Full time slices to RT scheduler
+
+By default, the Linux RT scheduler reserves a fraction of each scheduling period for non-RT processes,
+which keeps the system responsive if an RT application eats all CPU. We do not want this, as we
+try to get very predictable behavior from the RT scheduler.
+
+Create a file called `setup_rt.sh` and make it executable with `chmod +x setup_rt.sh`:
+```shell script
+sysctl -w kernel.sched_rt_runtime_us=-1
+sysctl -w kernel.sched_rt_period_us=1000000
+```
+
+***BEFORE TESTS***: give full time slices to RT tasks using ***`sudo ./setup_rt.sh`***
+
+## Running Tests
+
+***Before running tests make sure to run the following scripts:***
+- `sudo ./setup_cpu.sh`
+- `sudo ./map_interrupts_core_0.sh`
+- `sudo ./setup_rt.sh`
+
+To run the tests use the following (or a similar command with a different RT policy):
+
+`taskset FFFE chrt --rr 80 sudo -u $SUDO_USER `
+
+This maps the process to all cores but core 0 and runs it using the round-robin real-time scheduler.
+Replace `--rr` with `--fifo` to use the first-in, first-out scheduler.
diff --git a/NOTES.md b/NOTES.md
index 9a7fd82..3bf294c 100644
--- a/NOTES.md
+++ b/NOTES.md
@@ -4,6 +4,24 @@
 The new version of pls uses a more complicated/less user friendly API in favor of performance and memory guarantees.
 For the old version refer to the second half of this document.
 
+# 31.03.2020 - Test setup on BananaPI
+
+We currently use a Banana Pi M3 single-board computer for running our evaluations.
+The reason is that we aim for clean, isolated measurements on a system that
+introduces little jitter in CPU frequencies/performance with multiple equivalent cores.
+It is also a rather low-power board, with a TDP of about 15W, satisfying our desire for
+a platform that mimics embedded devices. On top of that it is rather cheap at under $100.
+
+The main concern is software.
+Options are the vendor image (kernel 3.4 with RT patch), an [Arch image from the forum](http://forum.banana-pi.org/t/bananapi-bpi-m3-new-image-archlinux-4-18-1-1-arch-2018-08-19/6544)
+(kernel 4.19 with RT patch), or [Armbian](https://www.armbian.com/bananapi-m3/) (kernel 5.4, NO RT patch).
+We tried all three. The vendor image is simply too old to reflect recent changes in Linux.
+The Arch kernel is relatively new and has an RT patch, but it requires quite a bit of manual setup.
+We therefore settle on the Armbian project, as it offers frequent updates and it is questionable
+whether the RT patch would bring much benefit for our test cases.
+
+Full setup documentation (tuned for isolated performance): [BANANAPI.md](./BANANAPI.md)
+
 # 24.03.2020 - mmap stack
 
 We added different means of allocating stacks for our coroutines.