From 938be84f1f080a7a6781ce1f8f873282c9ee2dbf Mon Sep 17 00:00:00 2001
From: FritzFlorian <flo.fritz@t-online.de>
Date: Fri, 28 Jun 2019 10:32:42 +0200
Subject: [PATCH] Add notes on Dataflow

---
 NOTES.md | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/NOTES.md b/NOTES.md
index ef4845c..1a70ff5 100644
--- a/NOTES.md
+++ b/NOTES.md
@@ -4,6 +4,72 @@ A collection of stuff that we noticed during development.
 Useful later on two write a project report and to go back
 in time to find out why certain decisions where made.
 
+## 26.06.2019 - Notes on Dataflow Implementation
+
+### Dataflow in general
+
+Dataflow-based programming is not unique to task-based
+programming. It is a programming paradigm of its own, and using
+it inside a task scheduler is simply an adaptation of the general
+idea of organizing programs by the flow of available data.
+
+Therefore, we first look at the general domain of dataflow
+programming, to understand the basic concepts.
+
+The work in \[1] gives a good overview of dataflow programming
+before 2000. It presents the basic execution concept
+and the idea of data driving program execution.
+Two main ways of execution can be distinguished: static and dynamic
+execution. Static execution allows only one token per arc in the
+execution graph, making it simple, but it does not allow much
+parallel execution. Dynamic execution allows an unbounded number
+of tokens per arc, making it harder to implement, but allowing
+unlimited parallelism. There are also models allowing a maximum
+of k tokens per arc, which is probably where we are going, as
+we will try to keep memory allocation static once again.
+
+Normally, dataflow programming is seen as a pure execution model,
+meaning that everything is dataflow: even expressions like
+`x = y + z` are executed as dataflow, and the idea is to create
+special hardware to execute such dependency graphs.
+We are instead interested in tasks coupled by dataflow dependencies.
+\[1] mentions this under names like threaded, hybrid or large-grain
+dataflow.
+
+Looking further into these coarse-grained dataflow models, we
+found \[2] to be a good overview of this area.
+The paper focuses mainly on implementations that also rely on
+special hardware, but the general principles hold for our implementation.
+Our kind of dataflow falls under the flow/instruction category:
+high-level coordination is achieved using dataflow, while
+individual work items are tasks in an imperative language.
+We still need to see whether ideas from this area are helpful to us;
+individual papers may give some clues if we need them later.
+
+For now we can conclude that we will probably implement some sort
+of k-limited, dynamic system (using tokens with different IDs).
+
+
+\[1] W. M. Johnston, J. R. P. Hanna, and R. J. Millar, “Advances in dataflow programming languages,” ACM Computing Surveys, vol. 36, no. 1, pp. 1–34, Mar. 2004.
+
+\[2] F. Yazdanpanah, C. Alvarez-Martinez, D. Jimenez-Gonzalez, and Y. Etsion, “Hybrid Dataflow/von-Neumann Architectures,” IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 6, pp. 1489–1509, Jun. 2014.
+
+### Dataflow in TBB and EMBB
+
+TBB's dataflow implementation follows (as far as we can see) no
+general dataflow theory and implements its own method.
+It relies on explicit concepts of buffering, multiple arcs
+ending in the same node, pulling and pushing modes for
+arcs, ...
+We think this is overly complicated and too far away from
+the classic model.
+
+EMBB seems to follow a token-based model with IDs distinguishing
+tokens belonging to different parallel executions.
+It uses arcs that can buffer a limited number of data items.
+Overall this seems rather close to what we have in mind, and we will
+look into it further.
+
 ## 24.06.2019 - Further Benchmark Settings and First Results
 
 As a further option to reduce potential inference of our benchmark
--
libgit2 0.26.0