Predictable Parallel Patterns Library for Scalable Smart Systems

Getting Started

This section gives a brief introduction on how to set up a minimal project that uses the PLS library. General notes on the development progress and notes on performance can be found in their respective files (NOTES.md, PERFORMANCE-v1.md and PERFORMANCE-v2.md).

Installation

Clone the repository and open a terminal session in its folder. Create a build folder using mkdir cmake-build-release and switch into it using cd cmake-build-release. Set up the CMake project using cmake ../ -DCMAKE_BUILD_TYPE=RELEASE, then install it as a system-wide dependency using sudo make install.pls.

At this point the library is installed on your system. To use it, add it to your existing CMake project using find_package(pls REQUIRED) and then link it to your target using target_link_libraries(your_target pls::pls).

Basic Usage

#include <pls/pls.h>
#include <iostream>

long fib(long n);

int main() {
    // All memory needed by the scheduler can be allocated in advance either on stack or using malloc.
    const unsigned int num_threads = 8;
    const unsigned int memory_per_thread = 2 << 14;
    static pls::static_scheduler_memory<num_threads, memory_per_thread> memory;

    // Create the scheduler instance (starts a thread pool).
    pls::scheduler scheduler{&memory, num_threads};

    // Wake up the thread pool and perform work.
    scheduler.perform_work([&] {
        long result = fib(20);
        std::cout << "fib(20)=" << result << std::endl;
    });
    // At this point the thread pool sleeps.
    // This can for example be used for periodic work.
}

long fib(long n) {
    if (n == 0) {
        return 0;
    }
    if (n == 1) {
        return 1;
    }

    // Example for the high level API.
    // Will run both functions in parallel as separate tasks.
    // Use long to match the return type of fib and avoid narrowing.
    long left, right;
    pls::invoke_parallel(
            [&] { left = fib(n - 1); },
            [&] { right = fib(n - 2); }
    );
    return left + right;
}
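
To make the fork-join pattern of the example concrete without depending on PLS, the following is a minimal sketch that uses std::async from the C++ standard library in place of pls::invoke_parallel. This is a swapped-in technique for illustration only and not how PLS schedules its tasks. The names fib_sequential and fib_fork_join and the depth parameter are hypothetical and do not exist in the library.

```cpp
#include <cassert>
#include <future>

// Plain sequential reference, useful for checking the parallel result.
long fib_sequential(long n) {
    assert(n >= 0);  // precondition: fib is only defined here for n >= 0
    return n < 2 ? n : fib_sequential(n - 1) + fib_sequential(n - 2);
}

// Fork-join sketch. std::async stands in for pls::invoke_parallel
// (an assumption for illustration, not the PLS implementation).
// `depth` is a hypothetical parameter that limits how many recursion
// levels spawn a new thread, so the example stays at a handful of threads.
long fib_fork_join(long n, int depth = 3) {
    assert(n >= 0);
    if (n < 2) {
        return n;
    }
    if (depth == 0) {
        return fib_sequential(n);
    }
    // Fork: the left branch runs on another thread ...
    std::future<long> left =
        std::async(std::launch::async, fib_fork_join, n - 1, depth - 1);
    // ... while the right branch runs on the calling thread.
    long right = fib_fork_join(n - 2, depth - 1);
    // Join: wait for the left branch before combining the results.
    return left.get() + right;
}
```

Calling fib_fork_join(20) should return the same value as fib_sequential(20). Unlike this sketch, which creates a new thread per fork, the example above hands both lambdas to a scheduler that was created once with a fixed number of threads and preallocated memory.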

Project Structure

The project uses CMake as its build system; the recommended IDE is either a simple text editor or CLion. We divide the project into sub-targets to separate the library itself from testing and example code. The library itself can be found in lib/pls, testing-related code is in test, and example and playground apps are in app.

Building

To build the project, first create a folder for the build (typically as a subfolder of the project) using mkdir cmake-build-debug. Change to the new folder using cd cmake-build-debug and initialize the CMake project using cmake ../ -DCMAKE_BUILD_TYPE=DEBUG. For release builds, do the same with build type RELEASE. Other build-time settings can also be passed at this setup step.

After this is done you can use normal make commands: make to build everything, make <target> to build a single target, or make install to install the library globally.

Available Settings:

  • -DEASY_PROFILER=ON/OFF
    • default OFF
    • Enabling will link the easy profiler library and enable its macros
    • Enabling has a performance hit (do not use in releases)
  • -DADDRESS_SANITIZER=ON/OFF
    • default OFF
    • Enables address sanitizer to be linked to the executable
    • Only one sanitizer can be active at once
    • Enabling has a performance hit (do not use in releases)
  • -DTHREAD_SANITIZER=ON/OFF
    • default OFF
    • Enables thread/datarace sanitizer to be linked to the executable
    • Only one sanitizer can be active at once
    • Enabling has a performance hit (do not use in releases)
  • -DDEBUG_SYMBOLS=ON/OFF
    • default OFF
    • Enables the build with debug symbols
    • Use for e.g. profiling the release build

Note that these settings are persistent for one CMake build folder. If you e.g. set a flag in the debug build it will not influence the release build, but it will persist in the debug build folder until you explicitly change it back.

Testing

Testing is done using Catch2 in the test subfolder. Tests are built into a target called tests and can be executed by building this executable and running it.

Data Race Detection

As this project contains a lot of concurrent code, we use Thread Sanitizer in our CI process and optionally in other builds. To set up CMake builds with the sanitizer enabled, add the CMake option -DTHREAD_SANITIZER=ON. Please test regularly with Thread Sanitizer enabled and make sure not to leave the repository in a state where the sanitizer reports errors.

Consider reading the section on common data races to get an idea of what we try to avoid in our code.
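
As a hedged illustration of the kind of problem Thread Sanitizer is meant to catch, the sketch below increments a shared counter from several threads. It is a generic example and not taken from the PLS code base; the function name count_with_atomic is hypothetical. The comment describes the racy variant, and the code shows one common fix using std::atomic.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Increments a shared counter from several threads and returns the final value.
// With a plain `long counter` and `++counter` this would be a data race
// (undefined behavior) of the kind Thread Sanitizer reports. std::atomic
// makes each increment a single indivisible read-modify-write operation.
long count_with_atomic(unsigned num_threads, long increments_per_thread) {
    assert(increments_per_thread >= 0);
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (long i = 0; i < increments_per_thread; ++i) {
                // Relaxed ordering is enough here: only the count matters.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (std::thread &worker : workers) {
        // join() synchronizes with the end of each thread, so the load
        // below observes every increment.
        worker.join();
    }
    return counter.load();
}
```

For example, count_with_atomic(4, 10000) returns 40000 on every run, whereas the unsynchronized variant may lose increments and would be flagged in a build configured with -DTHREAD_SANITIZER=ON.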

Profiling EasyProfiler

To make profiling portable and allow us to later analyze the logs programmatically, we use easy_profiler for capturing data. To enable profiling, install the library on your system (preferably by building it and then running make install) and set the CMake option -DEASY_PROFILER=ON.

After that see the invoke_parallel example app for activating the profiler. This will generate a trace file that can be viewed with the profiler_gui <output.prof> command.

Please note that the profiler adds overhead when looking at sub-millisecond method invocations as we do, and it cannot replace a separate profiler like gperf, valgrind or VTune Amplifier for detailed analysis. We still think it makes sense to include it as an optional feature, as the customizable colors and fine-grained events (including collection of variables) can be used to visualize the big picture of program execution. We also hope to use it in the future to log 'events' like successful and failed steals, as the general idea of efficiently logging information per thread might be helpful for further analysis.

Profiling VTune Amplifier

For detailed profiling of small performance hotspots we prefer to use Intel's VTune Amplifier. It gives insights into detailed microarchitecture usage and performance hotspots. Follow Intel's instructions for using it. Make sure to enable debug symbols (-DDEBUG_SYMBOLS=ON) in the analyzed build and that all optimizations are turned on (by choosing the release build).