From 220baad31ccec33cc1dffc9d937ad2f2b7049211 Mon Sep 17 00:00:00 2001
From: FritzFlorian
Date: Wed, 3 Jul 2019 15:36:33 +0200
Subject: [PATCH] Add notes on plan for our dataflow implementation

---
 NOTES.md | 106 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 106 insertions(+)

diff --git a/NOTES.md b/NOTES.md
index 1a70ff5..dc12969 100644
--- a/NOTES.md
+++ b/NOTES.md
@@ -4,6 +4,112 @@ A collection of stuff that we noticed during development.
 Useful later on to write a project report and to go back in time
 to find out why certain decisions were made.
 
+## 03.07.2019 - Outline/Plan for our Dataflow API
+
+The following describes our ideas for what we expect/want to build
+in our dataflow API. The restrictions and decisions try to strike a
+balance between usability, feasibility of the implementation and
+feature richness (what kinds of flows can even be represented using
+the API).
+
+We will model our dataflow closely after the EMBB implementation.
+There are buffers that are controlled by a color/clock and
+a DF graph has sources (inputs) and sinks (outputs).
+
+Some notable properties we want (and that can differ from EMBB):
+- A DF graph must be well behaved, i.e. for each full set of input
+values exactly one set of output values is produced (each parallel
+invocation therefore corresponds to exactly one value in the in-
+and outputs)
+- A DF graph must be declared with its full interface, i.e. its
+full set of input types and output types must be known
+when declaring it
+```c++
+dataflow<Inputs<String, Int>, Outputs<Image, Int>, NUM_PARAL> df_instance;
+```
+- Nodes inside the DF graph are produced by the graph itself, allowing
+it to propagate parallelism constraints (for example directly creating
+the children with correct buffer capacities). This is done by having
+factory functions on a concrete DF graph instance.
+- A DF graph's sources/sinks are 'packed', i.e. the user must
+provide a full set of inputs to trigger the DF graph and is provided
+a full set of outputs when reading a result from the DF graph
+(this highlights our intent of well-behaved flow graphs, as users
+do not even get the notion that it is ok to only return partial results)
+```c++
+auto source = df.source([](String &out1, Int &out2) {
+  if (elements_available) {
+    out1 = ...;
+    out2 = ...;
+    return true;
+  } else {
+    return false;
+  }
+});
+
+...
+
+auto sink = df.sink([](const Image &in1, const Int &in2) {
+  ...;
+});
+```
+- Easy APIs for working with array data are provided in the form of
+iterator sources/sinks
+```c++
+auto source = df.iterator_source();
+auto sink = df.iterator_sink();
+
+...
+
+source.reset(input.begin(), input.end());
+sink.reset(output.begin());
+df.wait_for_all();
+```
+- In the first version nodes are always fully parallel; further
+versions might include the per-node properties unordered_serial
+(a node is executed by at most one thread at a time, but in no fixed
+order, e.g. for accessing shared memory) or ordered_serial (e.g.
+for logging something in the correct order to a file/console).
+- Sinks/user-accessed outputs of a DF graph are always ordered_serial,
+preventing confusion for the end user and ensuring deterministic
+execution
+- Memory management for all buffers and the DF graph itself is made
+explicitly visible to the end user by requiring the user to hold all
+components of the graph in memory while using it. This is in line with
+our philosophy of not having hidden memory allocations, making
+development for e.g. embedded platforms simpler, as it is clear
+where and what resources are used (one can simply sizeof(...) all
+parts of the graph and find out how much memory the buffers and so on
+require)
+- This model in principle allows recursive invocation. We will not
+implement this in the first version, but keep the option open for later.
+This will potentially allow different patterns, like stencil operations,
+to be implemented with the system.
+
+## 03.07.2019 - Dataflow in EMBB
+
+EMBB's dataflow is a very simple but effective implementation of
+k-colored dataflow (a maximum of k concurrent invocations; data/tokens
+are marked with an individual color per parallel invocation).
+It enforces an acyclic, recursion-free flow and requires source
+nodes and sink nodes to be set explicitly (acting as in- and outputs).
+
+This allows it to send signals down arcs even if there is no
+value, e.g. if there is a split in the control flow, the 'not used'
+side of the flow will be fed an 'empty' token, signaling the sinks
+that the execution of this parallel instance has reached them.
+Once one color of tokens (i.e. one parallel execution instance)
+has reached ALL sinks, the model allows a new one to be fed in.
+Forcing all tokens to reach the sinks before new ones can enter
+keeps execution ordered, thus potentially limiting concurrency, but
+at the same time makes for a model that is very simple to implement.
+Computational nodes between sources and sinks are associated
+with input buffers (having the same capacity as the number of
+allowed parallel invocations). These can hold values from
+predecessors until all inputs for the node are ready. The node
+is started as a process as soon as the last needed input arrives
+(this event is detected by a reference counter of missing inputs).
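+
+To make this buffer/counter mechanism more concrete, here is a
+minimal, hypothetical sketch in plain C++ (this is not EMBB's actual
+code or API; names such as two_input_node or provide_first are made
+up purely for illustration) of how the last arriving input of one
+color can start a node via a counter of missing inputs:
+```c++
+#include <array>
+#include <atomic>
+#include <functional>
+#include <iostream>
+#include <string>
+#include <utility>
+
+// Toy sketch (NOT EMBB's real implementation): a node with two inputs
+// and NUM_PARAL parallel 'colors'. Every color owns one slot per input
+// buffer plus a counter of inputs that are still missing; whoever
+// delivers the last missing input fires the node body for that color.
+template <typename In1, typename In2, int NUM_PARAL>
+class two_input_node {
+ public:
+  explicit two_input_node(std::function<void(const In1&, const In2&)> body)
+      : body_{std::move(body)} {}
+
+  void provide_first(int color, In1 value) {
+    slots_[color].first = std::move(value);
+    complete_one(color);
+  }
+
+  void provide_second(int color, In2 value) {
+    slots_[color].second = std::move(value);
+    complete_one(color);
+  }
+
+ private:
+  struct slot {
+    In1 first{};
+    In2 second{};
+    std::atomic<int> missing_inputs{2};
+  };
+
+  void complete_one(int color) {
+    // fetch_sub returns the previous value: if it was 1 we just
+    // delivered the last missing input and may start the node.
+    if (slots_[color].missing_inputs.fetch_sub(1) == 1) {
+      body_(slots_[color].first, slots_[color].second);
+      slots_[color].missing_inputs.store(2);  // slot reusable for a new token
+    }
+  }
+
+  std::function<void(const In1&, const In2&)> body_;
+  std::array<slot, NUM_PARAL> slots_;
+};
+
+int main() {
+  two_input_node<int, std::string, 4> node(
+      [](const int& i, const std::string& s) {
+        std::cout << "fired with " << i << ", " << s << "\n";
+      });
+  node.provide_first(0, 42);        // color 0: one input still missing
+  node.provide_second(0, "token");  // color 0 complete -> node body runs
+}
+```
+A real implementation would additionally forward the produced result
+into the successors' input buffers and handle the 'empty' tokens
+described above; this sketch only shows the counting/firing mechanism.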
+
 ## 26.06.2019 - Notes on Dataflow Implementation
 
 ### Dataflow in general
-- 
libgit2 0.26.0