Commit a4b03ffe, authored Jun 06, 2019 by FritzFlorian
parent 5044f0a1

Found general problem for FFT performance.

Pipeline #1253 failed with stages in 3 minutes 38 seconds.

Showing 2 changed files with 45 additions and 2 deletions:

- PERFORMANCE.md (+37, -0)
- lib/pls/include/pls/algorithms/invoke_parallel_impl.h (+8, -2)
PERFORMANCE.md
@@ -318,3 +318,40 @@ threads (threads without any actual work) and the threads actually
performing work. Most likely there is a resource on the same cache
line that hinders the working threads, but we cannot really
figure out which one it is.
### Commit be2cdbfe - Locking Deque

Switching to a locking deque has not improved (and has even slightly hurt)
performance, so we think that the deque itself is not the
part slowing down our execution.
### Commit 5044f0a1 - Performance Bottleneck in FFT FIXED
By moving from directly calling one of the parallel functions inline

```c++
scheduler::spawn_child(sub_task_2);
function1(); // Execute first function 'inline' without spawning a sub_task object
```
to spawning two tasks
```c++
scheduler::spawn_child(sub_task_2);
scheduler::spawn_child(sub_task_1);
```
we were able to fix the bad performance of our framework in the
FFT benchmark (where there is a lot of spinning/idling of some
worker threads).

We think this is due to some sort of cache misses/bus contention
on the finishing counters. This would make sense, as the drop
at the hyperthreading mark indicates problems with this part of the
CPU pipeline (although it did not show clearly in our profiling runs).
We will now try to find the exact spot where the problem originates and
fix the source rather than 'circumventing' it with these extra tasks.
(This, in turn, should hopefully even boost the performance of all other
workloads, as contention on the bus/cache is always bad.)
lib/pls/include/pls/algorithms/invoke_parallel_impl.h
@@ -13,10 +13,13 @@ template<typename Function1, typename Function2>
 void invoke_parallel(const Function1 &function1, const Function2 &function2) {
   using namespace ::pls::internal::scheduling;

   auto sub_task_1 = lambda_task_by_reference<Function1>(function1);
   auto sub_task_2 = lambda_task_by_reference<Function2>(function2);

   scheduler::spawn_child(sub_task_2);
-  function1(); // Execute first function 'inline' without spawning a sub_task object
+  scheduler::spawn_child(sub_task_1);
+  // TODO: Research the exact cause of this being faster
+  // function1(); // Execute first function 'inline' without spawning a sub_task object
+
   scheduler::wait_for_all();
 }
@@ -24,12 +27,15 @@ template<typename Function1, typename Function2, typename Function3>
 void invoke_parallel(const Function1 &function1, const Function2 &function2, const Function3 &function3) {
   using namespace ::pls::internal::scheduling;

   auto sub_task_1 = lambda_task_by_reference<Function1>(function1);
   auto sub_task_2 = lambda_task_by_reference<Function2>(function2);
   auto sub_task_3 = lambda_task_by_reference<Function3>(function3);

   scheduler::spawn_child(sub_task_2);
   scheduler::spawn_child(sub_task_3);
-  function1(); // Execute first function 'inline' without spawning a sub_task object
+  scheduler::spawn_child(sub_task_1);
+  // TODO: Research the exact cause of this being faster
+  // function1(); // Execute first function 'inline' without spawning a sub_task object
+
   scheduler::wait_for_all();
 }