From a78ff0d179f487e5509d4b3ce3444971e96be9fa Mon Sep 17 00:00:00 2001
From: FritzFlorian
Date: Tue, 28 May 2019 14:46:58 +0200
Subject: [PATCH] Add further notes on CPI performance.

---
 PERFORMANCE.md | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/PERFORMANCE.md b/PERFORMANCE.md
index dfe7d0c..343c880 100644
--- a/PERFORMANCE.md
+++ b/PERFORMANCE.md
@@ -155,3 +155,21 @@ Best case Native:
 Average case Native:
+
+What we find very interesting is that the best-case times of our
+pls library are very fast (as good as TBB), but the average times
+are much worse. We currently do not know why this is the case.
+
+### Commit afd0331b - Intel VTune Amplifier
+
+We did several measurements with Intel's VTune Amplifier profiling
+tool. The main thing we notice is that the cycles per instruction (CPI)
+for our useful work blocks increase, thus requiring more CPU time
+for the actual useful work.
+
+We also measured an implementation using TBB and found no significant
+difference, i.e. TBB also has a higher CPI with 8 threads.
+Our conclusion after this long hunt for performance is that we
+might just be bound by some general performance issues with our code.
+The next step will therefore be to read the other frameworks and our
+code carefully, trying to find potential issues.
-- 
libgit2 0.26.0