diff --git a/PERFORMANCE.md b/PERFORMANCE.md
index a88b3ef..f9cc49e 100644
--- a/PERFORMANCE.md
+++ b/PERFORMANCE.md
@@ -66,3 +66,23 @@ Additionaly, the first one uses our high level API (parallel invoke),
 while the second one uses our low level API.
 It is worth investigating if either or high level API or the structure
 of the memory access in FFT are the problem.
+
+### Commit cf056856 - Remove two-level scheduler
+
+In this test we replace the two-level scheduler with ONLY fork_join
+tasks. This removes the top-level steal overhead and performs only
+internal stealing. For this we set the fork_join task as the only
+possible task type, removed the top-level rw-lock and the digging
+down to our level, and now rely solely on internal stealing.
+
+Average results FFT:
+
+
+
+Average results Unbalanced:
+
+
+
+There seems to be only a minor performance difference between the two
+variants, suggesting that our two-level approach is not the part
+causing our weaker performance.
diff --git a/media/cf056856_fft_average.png b/media/cf056856_fft_average.png
new file mode 100644
index 0000000..ec55027
Binary files /dev/null and b/media/cf056856_fft_average.png differ
diff --git a/media/cf056856_unbalanced_average.png b/media/cf056856_unbalanced_average.png
new file mode 100644
index 0000000..75d2829
Binary files /dev/null and b/media/cf056856_unbalanced_average.png differ