From 220baad31ccec33cc1dffc9d937ad2f2b7049211 Mon Sep 17 00:00:00 2001
From: FritzFlorian
Date: Wed, 3 Jul 2019 15:36:33 +0200
Subject: [PATCH] Add notes on plan for our dataflow implementation

---
 NOTES.md | 106 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 106 insertions(+)

diff --git a/NOTES.md b/NOTES.md
index 1a70ff5..dc12969 100644
--- a/NOTES.md
+++ b/NOTES.md
@@ -4,6 +4,112 @@ A collection of stuff that we noticed during development.
 Useful later on to write a project report and to go back in time
 to find out why certain decisions were made.
 
+## 03.07.2019 - Outline/Plan for our Dataflow API
+
+The following describes our ideas for what we expect/want to build
+in our dataflow API. The restrictions and decisions try to strike a
+balance between usability, feasibility of the implementation and
+feature richness (what kinds of flows can even be represented using
+the API).
+
+We will model our dataflow closely after the EMBB implementation.
+There are buffers that are controlled by a color/clock and
+a DF graph has sources (inputs) and sinks (outputs).
+
+Some notable properties we want (and that can differ from EMBB):
+- A DF graph must be well behaved, i.e. for each full set of input
+values exactly one set of output values is produced (each parallel
+invocation therefore corresponds to exactly one value in the in-
+and outputs)
+- A DF graph must be declared with its full interface, i.e. its
+full set of input types and output types must be known
+when declaring it
+```c++
+dataflow<Inputs<String, Int>, Outputs<Image, Int>, NUM_PARAL> df_instance;
+```
+- Nodes inside the DF graph are produced by the graph itself, allowing
+it to propagate parallelism constraints (for example directly creating
+the children with correct buffer capacities). This is done by having
+factory functions on a concrete DF graph instance.
+- A DF graph's sources/sinks are 'packed', i.e. the user must
+provide a full set of inputs to trigger the DF graph and is provided
+a full set of outputs when reading a result from the DF graph
+(this highlights our intent of well-behaved flow graphs, as users
+do not even get the notion that it is ok to only return partial results)
+```c++
+auto source = df.source([](String &out1, Int &out2) {
+  if (elements_available) {
+    out1 = ...;
+    out2 = ...;
+    return true;
+  } else {
+    return false;
+  }
+});
+
+...
+
+auto sink = df.sink([](const Image &in1, const Int &in2) {
+  ...;
+});
+```
+- Easy APIs for working with array data are provided in the form of
+iterator sources/sinks
+```c++
+auto source = df.iterator_source();
+auto sink = df.iterator_sink();
+
+...
+
+source.reset(input.begin(), input.end());
+sink.reset(output.begin());
+df.wait_for_all();
+```
+- In the first version nodes are always fully parallel; further
+versions might include the per-node properties unordered_serial
+(a node is executed by at most one thread at a time, but in no fixed
+order, e.g. for accessing shared memory) or ordered_serial (e.g.
+for logging something in the correct order to a file/console).
+- Sinks/user-accessed outputs of a DF graph are always ordered_serial,
+preventing confusion for the end user and ensuring deterministic
+execution
+- Memory management for all buffers and the DF graph itself is made
+explicitly visible to the end user by requiring the user to hold all
+components of the graph in memory while using it. This is in line with
+our philosophy of not having hidden memory allocations, making
+development for e.g. embedded platforms simpler, as it is clear
+where and what resources are used (one can simply sizeof(...) all
+parts of the graph and find out how much memory the buffers and so on
+require)
+- This model in principle allows recursive invocation. We will not
+implement this in the first version, but keep the option open for later.
+This will potentially allow different patterns, like stencil operations,
+to be implemented with the system.
+
+## 03.07.2019 - Dataflow in EMBB
+
+EMBB's dataflow is a very simple but effective implementation of
+k-colored dataflow (a maximum of k concurrent invocations; data/tokens
+are marked with an individual color per parallel invocation).
+It enforces an acyclic, recursion-free flow and requires source
+nodes and sink nodes to be set explicitly (acting as in- and outputs).
+
+This allows it to send signals down arcs even if there is no
+value, e.g. if there is a split in the control flow, the 'not used'
+side of the flow will be fed an 'empty' token, signaling the sinks
+that the execution of this parallel instance has reached them.
+Once one color of tokens (i.e. one parallel execution instance)
+has reached ALL sinks, the model allows a new one to be fed in.
+Forcing all tokens to reach the sinks before new ones can enter
+keeps execution ordered, thus potentially limiting concurrency, but
+at the same time makes for a model that is very simple to implement.
+Computational nodes between sources and sinks are associated
+with input buffers (having the same capacity as the number of
+allowed parallel invocations). These can hold values from
+predecessors until all inputs for the node are ready. The node
+is started as a process as soon as the last needed input arrives
+(this event is detected by a reference counter of missing inputs).
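+
+To make this buffer/counter mechanism more concrete, here is a
+minimal, hypothetical sketch in plain C++ (this is not EMBB's actual
+code or API; names such as two_input_node or provide_first are made
+up purely for illustration) of how the last arriving input of one
+color can start a node via a counter of missing inputs:
+```c++
+#include <array>
+#include <atomic>
+#include <functional>
+#include <iostream>
+#include <string>
+#include <utility>
+
+// Toy sketch (NOT EMBB's real implementation): a node with two inputs
+// and NUM_PARAL parallel 'colors'. Every color owns one slot per input
+// buffer plus a counter of inputs that are still missing; whoever
+// delivers the last missing input fires the node body for that color.
+template <typename In1, typename In2, int NUM_PARAL>
+class two_input_node {
+ public:
+  explicit two_input_node(std::function<void(const In1&, const In2&)> body)
+      : body_{std::move(body)} {}
+
+  void provide_first(int color, In1 value) {
+    slots_[color].first = std::move(value);
+    complete_one(color);
+  }
+
+  void provide_second(int color, In2 value) {
+    slots_[color].second = std::move(value);
+    complete_one(color);
+  }
+
+ private:
+  struct slot {
+    In1 first{};
+    In2 second{};
+    std::atomic<int> missing_inputs{2};
+  };
+
+  void complete_one(int color) {
+    // fetch_sub returns the previous value: if it was 1 we just
+    // delivered the last missing input and may start the node.
+    if (slots_[color].missing_inputs.fetch_sub(1) == 1) {
+      body_(slots_[color].first, slots_[color].second);
+      slots_[color].missing_inputs.store(2);  // slot reusable for a new token
+    }
+  }
+
+  std::function<void(const In1&, const In2&)> body_;
+  std::array<slot, NUM_PARAL> slots_;
+};
+
+int main() {
+  two_input_node<int, std::string, 4> node(
+      [](const int& i, const std::string& s) {
+        std::cout << "fired with " << i << ", " << s << "\n";
+      });
+  node.provide_first(0, 42);        // color 0: one input still missing
+  node.provide_second(0, "token");  // color 0 complete -> node body runs
+}
+```
+A real implementation would additionally forward the produced result
+into the successors' input buffers and handle the 'empty' tokens
+described above; this sketch only shows the counting/firing mechanism.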
+
 ## 26.06.2019 - Notes on Dataflow Implementation
 
 ### Dataflow in general
-- 
libgit2 0.26.0