Commit 2d470f1c by FritzFlorian

Add notes on banana pi setup for tests.

parent 4e865e0e
Pipeline #1427 failed with stages
in 31 seconds
# Setup BananaPI for benchmarking
The goal of this documentation is to get a Linux image running on a
Banana Pi board that allows for well-isolated benchmarks showing full
time distributions of the measurement runs.
## Base Setup
The first step is to get a Linux image running on the Banana Pi. Armbian worked very well for us;
follow these steps to get it up and running:
- Download the mainline base kernel from Armbian: https://www.armbian.com/bananapi-m3/#kernels-archive-all
- Unpack the download
- Prepare a micro SD card for flashing
- Use Etcher (https://www.balena.io/etcher/) or a similar tool to write the image to the SD card
- Insert the micro SD card into the Banana Pi and power it on
- Follow the instructions to set up a user account (best done using HDMI out and an attached keyboard)
- Test the network connection/SSH login (in our experience Armbian just works, which is why we chose it)
## Tweaking Scheduler, CPU and Interrupts
We want as little dispersion from system jitter as possible. We recommend tweaking the
scheduler, CPU and interrupt settings before running benchmarks.
See the subsections below for the individual measures. ***Before running tests, make sure to
run the following scripts:***
- `sudo ./setup_cpu.sh`
- `sudo ./map_interrupts_core_0.sh`
- `sudo ./setup_rt.sh`
Then start your tests manually mapped to cores 1 to 7.
### Pin all other processes to core 0
To further reduce interference with our controlled benchmark environment, we map all unrelated
processes to core 0 of the system and run our benchmarks on cores 1 to 7.
The system uses systemd, which makes it the simplest place to change the default process affinity.
Edit the file `/etc/systemd/system.conf` and set `CPUAffinity=0`. This makes all processes forked
from systemd run on core 0. Benchmarks can then be manually mapped to different cores.
***BEFORE TESTS***: to make the config apply ***restart your system***
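After the reboot it can be useful to confirm that the affinity setting actually took effect. The following is a minimal sketch, not part of the original scripts: the helper names `cpus_allowed_from_status` and `allowed_cpus` are hypothetical, and it assumes the kernel exposes the `Cpus_allowed_list` field in `/proc/<pid>/status`.

```shell
# Print the CPU list found in a /proc/<pid>/status style file.
# (Hypothetical helper, introduced only for this check.)
cpus_allowed_from_status() {
    awk '/^Cpus_allowed_list:/ { print $2 }' "$1"
}

# Print the allowed CPU list of a PID (defaults to PID 1, i.e. systemd).
allowed_cpus() {
    cpus_allowed_from_status "/proc/${1:-1}/status"
}
```

If the setting was applied, `allowed_cpus 1` should report only core `0`, and so should services started by systemd.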
### CPU frequency
Limiting the frequency to 1 GHz makes sure that the Banana Pi does not throttle during the tests.
Additionally, disabling dynamic frequency scaling makes tests more reproducible.
Create a file called 'setup_cpu.sh' and make it executable with 'chmod +x setup_cpu.sh':
```shell script
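#!/bin/bash
# Pin all cores to a fixed 1.008 GHz with the performance governor (run as root).
echo "Writing frequency utils settings file..."
echo "ENABLE=true
MIN_SPEED=1008000
MAX_SPEED=1008000
GOVERNOR=performance" > /etc/default/cpufrequtils
echo "Restarting frequency utils service..."
# Assumption: the cpufrequtils service is managed through systemd on this image.
systemctl restart cpufrequtils
echo "Done!"
echo "Try ./watch_cpu.sh to see if everything worked."
echo "Test your cooling by stressing the cpu and watching the temperature output."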
```
Create a file called 'watch_cpu.sh' and make it executable with 'chmod +x watch_cpu.sh':
````shell script
echo "Max Frequencies"
cat /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_max_freq
echo "Actual Frequencies"
cat /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_cur_freq
echo "Temps.."
cat /sys/class/thermal/thermal_zone*/temp
````
***BEFORE TESTS***:
To set up the CPU, run ***`sudo ./setup_cpu.sh`*** before your tests. To check that the change worked
and that the temperatures remain stable, use the `./watch_cpu.sh` script.
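Reading eight frequency values by eye is error-prone, so the check can also be automated. The sketch below is an illustration only: `check_freqs` is a hypothetical helper, and it assumes the cpufreq sysfs files contain a single value in kHz.

```shell
# Succeed only if every given frequency file contains the expected value (kHz).
# $1: expected frequency, remaining arguments: files to check.
check_freqs() {
    local expected="$1" f
    shift
    for f in "$@"; do
        [ "$(cat "$f")" = "$expected" ] || { echo "mismatch in $f"; return 1; }
    done
}
```

A possible invocation is `check_freqs 1008000 /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq`, which returns a non-zero status and names the first core that deviates from the configured frequency.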
### Map interrupts to core 0
Interrupts can interfere with our benchmarks. We therefore map them to core 0 where possible and run our tests on
cores 1 to 7.
Create a file called 'map_interrupts_core_0.sh' and make it executable with 'chmod +x map_interrupts_core_0.sh':
```shell script
#!/bin/bash
# Write CPU mask 1 (core 0 only) to every IRQ's affinity file (run as root).
echo "Try to map interrupts to core 0."
echo "Some might fail because they cannot be mapped (e.g. core specific timers)."
echo ""
echo ""
for dir in /proc/irq/*/
do
    echo "Mapping $dir ..."
    echo 1 > "${dir}smp_affinity"
done
```
***BEFORE TESTS***: map the interrupts to core 0 using ***`sudo ./map_interrupts_core_0.sh`***
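Because some interrupts cannot be remapped, it helps to see which ones are still allowed on other cores after the script has run. This is a minimal sketch under the assumption that each `/proc/irq/<n>/` directory provides an `smp_affinity_list` file; the function name `unpinned_irqs` is hypothetical.

```shell
# List the IRQ numbers whose affinity list is not exactly "0".
# $1: IRQ root directory (defaults to /proc/irq).
unpinned_irqs() {
    local root="${1:-/proc/irq}" dir
    for dir in "$root"/*/; do
        # Skip entries without a readable affinity list.
        [ -r "${dir}smp_affinity_list" ] || continue
        if [ "$(cat "${dir}smp_affinity_list")" != "0" ]; then
            basename "$dir"
        fi
    done
}
```

Calling `unpinned_irqs` without arguments prints one IRQ number per line; these can be looked up in `/proc/interrupts` to judge whether they matter for the benchmark cores.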
### Full time slices to RT scheduler
By default, the RT scheduler in Linux leaves a fraction of its scheduling time to non-RT processes,
which keeps the system responsive if an RT application eats all CPU. We do not want this, as we
aim for very predictable behavior in our RT scheduler.
Create a file called 'setup_rt.sh' and make it executable with 'chmod +x setup_rt.sh':
```shell script
sysctl -w kernel.sched_rt_runtime_us=-1
sysctl -w kernel.sched_rt_period_us=1000000
```
***BEFORE TESTS***: give full time slices to RT tasks ***`sudo ./setup_rt.sh`***
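To make the effect of the two sysctl values concrete, the share of each period available to RT tasks can be computed from them. The helper `rt_share` below is a hypothetical illustration, not part of the original scripts; it treats a negative runtime as "no limit", which matches the `-1` written above.

```shell
# Print the share of each period that RT tasks may use.
# $1: sched_rt_runtime_us value, $2: sched_rt_period_us value.
rt_share() {
    local runtime="$1" period="$2"
    if [ "$runtime" -lt 0 ]; then
        echo "unlimited"
    else
        echo "$(( runtime * 100 / period ))%"
    fi
}
```

For example, `rt_share "$(cat /proc/sys/kernel/sched_rt_runtime_us)" "$(cat /proc/sys/kernel/sched_rt_period_us)"` reports the current setting. With the usual kernel defaults of 950000 and 1000000 it prints `95%`; after `setup_rt.sh` it should print `unlimited`.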
## Running Tests
***Before running tests make sure to run the following scripts:***
- `sudo ./setup_cpu.sh`
- `sudo ./map_interrupts_core_0.sh`
- `sudo ./setup_rt.sh`
To run the tests, use the following (or a similar command with a different RT policy):
`taskset FFFE chrt --rr 80 sudo -u $SUDO_USER <benchmark>`
This maps the process to all cores except core 0 and runs it using the round-robin real-time scheduler.
Replace --rr with --fifo to use the first-in, first-out scheduler.
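The hexadecimal mask passed to `taskset` has one bit per core, with bit 0 standing for core 0. The following sketch shows how such a mask can be derived from a list of core numbers; `cpu_mask` is a hypothetical helper introduced only for illustration.

```shell
# Print the hexadecimal affinity mask for the given core numbers.
cpu_mask() {
    local mask=0 cpu
    for cpu in "$@"; do
        # Set the bit that corresponds to this core.
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}
```

`cpu_mask 1 2 3 4 5 6 7` prints `fe`, which selects the same cores on the eight-core board as the `FFFE` mask used above, since the higher bits refer to cores that do not exist. The same rule explains the value `1` used for interrupts: `cpu_mask 0` prints `1`. As an alternative to a mask, `taskset -c 1-7` accepts the core list directly.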
......@@ -4,6 +4,24 @@ The new version of pls uses a more complicated/less user friendly
API in favor of performance and memory guarantees.
For the old version refer to the second half of this document.
# 31.03.2020 - Test setup on BananaPI
We currently use a Banana Pi M3 single-board computer for running our evaluations.
The reason is that we aim for clean, isolated measurements on a system that
introduces little jitter in CPU frequencies/performance and has multiple equivalent cores.
It is also a rather low-power board, with a TDP of about 15W, satisfying our desire for
a platform that mimics embedded devices. On top of that, it is rather cheap at under $100.
The main concern is software. Options are the vendor image (kernel 3.4 with RT patch),
an [arch image from the forum](http://forum.banana-pi.org/t/bananapi-bpi-m3-new-image-archlinux-4-18-1-1-arch-2018-08-19/6544)
(kernel 4.19 with RT patch) or [armbian](https://www.armbian.com/bananapi-m3/) (kernel 5.4, NO RT patch).
We tried all three. The vendor image is simply too old to be comparable to modern changes in Linux.
The Arch kernel is relatively new and has an RT patch, but it requires quite a bit of manual setup.
We therefore settled on the Armbian project, as it offers frequent updates and it is questionable
whether the RT patch even has big advantages for our test cases.
Full setup documentation (tuned for isolated performance): [BANANAPI.md](./BANANAPI.md)
# 24.03.2020 - mmap stack
We added different means of allocating stacks for our coroutines.
......