Commits · 872c1a72025b47d86d4852f0159ea220644044b1 · las3_pub / predictable_parallel_patterns

24 Mar, 2020 1 commit

Add different stack allocators. · 872c1a72

It is now possible to use a memory-mapped stack that raises a SIGSEGV when a coroutine stack is exhausted.
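
A minimal sketch of how such a stack could be set up on a POSIX system, assuming `mmap` plus one `PROT_NONE` guard page below the usable area. The class name `guarded_stack` and its interface are hypothetical and not taken from the repository.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper (not the project's allocator): a stack region whose
// lowest page is inaccessible, so an overflow faults instead of silently
// corrupting neighbouring memory.
class guarded_stack {
 public:
  explicit guarded_stack(std::size_t usable_bytes) {
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    // Round the usable size up to whole pages and add one guard page.
    usable_ = (usable_bytes + page - 1) / page * page;
    total_ = usable_ + page;
    void* mem = mmap(nullptr, total_, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return;  // base_ stays null, valid() is false
    base_ = static_cast<char*>(mem);
    // Stacks grow downwards on x86 and ARM, so the guard is the lowest page.
    if (mprotect(base_, page, PROT_NONE) != 0) {
      munmap(base_, total_);
      base_ = nullptr;
      return;
    }
    guard_ = page;
  }
  ~guarded_stack() {
    if (base_ != nullptr) munmap(base_, total_);
  }
  guarded_stack(const guarded_stack&) = delete;
  guarded_stack& operator=(const guarded_stack&) = delete;

  bool valid() const { return base_ != nullptr; }
  char* bottom() const { return base_ + guard_; }  // lowest usable address
  char* top() const { return base_ + total_; }     // initial stack pointer
  std::size_t usable_size() const { return usable_; }

 private:
  char* base_ = nullptr;
  std::size_t guard_ = 0;
  std::size_t usable_ = 0;
  std::size_t total_ = 0;
};
```

Any access below `bottom()` hits the guard page and the kernel delivers SIGSEGV, which is the behaviour described above.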

committed Mar 24, 2020

18 Mar, 2020 1 commit

Rework memory allocation to heap based RAII objects. · 927d8ac5

Remove the strict static memory allocation scheme in favour of placing objects on the heap at startup. This still meets the requirements of modern, high-performance embedded systems, but makes the APIs a lot cleaner.
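
To illustrate the idea, here is a small sketch of a heap-based RAII object that allocates everything once in its constructor and releases it in its destructor. The names `task_slot` and `scheduler_memory` are made up for this example and do not reflect the project's actual classes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical element type standing in for per-task scheduler state.
struct task_slot {
  int payload = 0;
};

// All memory is acquired once, at construction (i.e. at startup), and is
// released automatically when the object goes out of scope.
class scheduler_memory {
 public:
  scheduler_memory(std::size_t num_threads, std::size_t tasks_per_thread)
      : tasks_per_thread_(tasks_per_thread),
        slots_(std::make_unique<task_slot[]>(num_threads * tasks_per_thread)) {}

  // No allocation happens here, so the hot path stays allocation free.
  task_slot& slot(std::size_t thread, std::size_t index) {
    return slots_[thread * tasks_per_thread_ + index];
  }

 private:
  std::size_t tasks_per_thread_;
  std::unique_ptr<task_slot[]> slots_;
};
```

The sizes are plain constructor arguments here, which is one way such an API can end up cleaner than passing pre-sized buffers around.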

committed Mar 18, 2020

13 Mar, 2020 2 commits

Traded cas field uses platform independent data type. · 08b941fc

The cas size could exceed an unsigned long, so we use the correct cas_integer type for the traded_cas_field representations.
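
As a rough illustration of why the width matters: the sketch below assumes a 64-bit `cas_integer` and an invented layout (a 2-bit tag below a value). The real layout of `traded_cas_field` is not shown in this log, so every name except `cas_integer` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: cas_integer is a fixed 64-bit type. An unsigned long is only
// 32 bits wide on many 32-bit targets, so it could truncate such a field.
using cas_integer = std::uint64_t;

constexpr unsigned tag_bits = 2;  // hypothetical layout
constexpr cas_integer tag_mask = (cas_integer{1} << tag_bits) - 1;

// Pack a value and a small tag into one word usable with a single CAS.
constexpr cas_integer pack(cas_integer value, cas_integer tag) {
  return (value << tag_bits) | (tag & tag_mask);
}
constexpr cas_integer unpack_value(cas_integer field) {
  return field >> tag_bits;
}
constexpr cas_integer unpack_tag(cas_integer field) {
  return field & tag_mask;
}

static_assert(sizeof(cas_integer) == 8, "field must be 64 bits wide");
```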

committed Mar 13, 2020

Change tsan integration to not 'cache' fibers. · f8ab8e0a

We fixed the bug in tsan that caused it to crash after creating/deleting many fibers, so there is no longer any need for this cache mechanism (you have to use the most recent clang build with the patch for it to work, though).

committed Mar 13, 2020

23 Feb, 2020 1 commit

Add yielding to scheduler loop. · 1e1e08a9

We yield after num_thread failed steals in a row. This parameter can be tuned for better performance, but we stick to a sensible default just to prevent massive spinning.
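
The yielding rule can be sketched as follows. This is not the scheduler's real loop: `steal_loop`, its parameters and the `try_steal` callback are placeholders, and the loop is bounded only so the example terminates.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>

// Runs at most max_attempts steal attempts and returns how often the
// thread yielded. try_steal is a hypothetical callback that returns true
// when a task was stolen.
inline std::size_t steal_loop(std::size_t num_threads,
                              std::size_t max_attempts,
                              const std::function<bool()>& try_steal) {
  std::size_t failed_in_a_row = 0;
  std::size_t yields = 0;
  for (std::size_t attempt = 0; attempt < max_attempts; ++attempt) {
    if (try_steal()) {
      failed_in_a_row = 0;  // a success resets the streak
      continue;
    }
    if (++failed_in_a_row >= num_threads) {
      // num_threads failures in a row: give up the CPU instead of spinning.
      std::this_thread::yield();
      ++yields;
      failed_in_a_row = 0;
    }
  }
  return yields;
}
```

Using the thread count as the threshold roughly means "every other worker was tried once without success"; a different threshold would only change the `>=` comparison.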

committed Feb 23, 2020

09 Feb, 2020 2 commits
- Add cscontext ARMv7 assembly and fast path optimization. · 89b6e3cb
  FritzFlorian committed Feb 09, 2020
- First working version on both ARM and x86. · 3c60e8d7
  FritzFlorian committed Feb 09, 2020
05 Feb, 2020 2 commits

WIP: Remove unneeded attributes from scheduler. · 731b47c5
FritzFlorian committed Feb 05, 2020

WIP: Add workaround for tsan short lived fiber support. · f347849f

Tsan does not cope well with rapidly destroyed/created fibers. As it is currently too much effort to fully investigate the tsan issue, we work around it by caching the short-lived fibers based on their stack base address. This allows us to use thread sanitizer for now.

committed Feb 05, 2020

03 Feb, 2020 1 commit
- WIP: Add fcontext and thread sanitizer support to our coroutine abstraction. · c25e6134
  FritzFlorian committed Feb 03, 2020
30 Jan, 2020 3 commits

WIP: temporarily fix CMake issues with export targets. · 4a44ad9f

Older CMake versions won't work with export targets in different directories. For now we simply add the context_switcher manually to the export target of pls.

committed Jan 30, 2020

Fix matrix multiplication benchmark for new scheduler. · 7796022f
FritzFlorian committed Jan 30, 2020

WIP: clean through scheduler code and fix obvious issues. · 6ee522a3

We still see very sporadic crashes, however the current version is at least a starting point for refactoring and debugging. Next steps have to be to re-enable tooling support (i.e. add code to let sanitizers do their work).

committed Jan 30, 2020

29 Jan, 2020 1 commit

WIP: Add simple external trading deque test. · 2adb2d16

The current version has race conditions and is hard to debug (especially because of the fibers: if the wrong thread executes on a fiber, we get segfaults very quickly). To combat this mess, we now refactor the code bit by bit while also adding tests where this can be done with reasonable effort.

committed Jan 29, 2020

27 Jan, 2020 2 commits

WIP: First running version of stealing. · 22f4c598

The project is currently really messy and there are sporadic SIGSEGVs. This indicates that we still have a race in our code. Thread Sanitizer does not work with our current implementation, as it needs annotations for fibers.

The next step is to clean up the project and maybe add thread sanitizer support to our fiber implementation. This should help finding the remaining bugs.

committed Jan 27, 2020

WIP: first stable version of stealing outline. · c85f2d0f
FritzFlorian committed Jan 27, 2020

26 Jan, 2020 2 commits
- Fix: add missing .cpp file to last commit. · 6027f7be
  FritzFlorian committed Jan 26, 2020
- Sketch out deque sync in case of a steal. · eecbe38d
  FritzFlorian committed Jan 26, 2020
24 Jan, 2020 2 commits

Sketch 'externally trading task deque'. · 0141a57a

The deque trades tasks when stealing. Right now only the fast local path is tested and implemented. For the next step to work we also need to add the resource stack and resource trading to the system.

committed Jan 24, 2020

Sketch minimal serial calling sequence. · 625836aa

The current state shows the minimum actions taken to execute a parallel call: get the thread local, find the active frame, execute on the next frame and return to the active frame.
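
The four steps can be sketched roughly like this. The types `frame` and `thread_state` and the functions `local_state` and `spawn_serial` are invented for illustration, and running the body "on the next frame" is modelled as simple bookkeeping rather than a real context switch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the scheduler's per-thread data.
struct frame {
  std::size_t depth = 0;
};
struct thread_state {
  std::array<frame, 16> frames{};
  std::size_t active = 0;  // index of the currently active frame
};

inline thread_state& local_state() {
  thread_local thread_state state;
  return state;
}

template <typename F>
void spawn_serial(F&& body) {
  thread_state& state = local_state();      // 1. get the thread local
  const std::size_t caller = state.active;  // 2. find the active frame
  assert(caller + 1 < state.frames.size());
  state.active = caller + 1;                // 3. execute on the next frame
  state.frames[state.active].depth = state.active;
  body();
  state.active = caller;                    // 4. return to the active frame
}
```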

committed Jan 24, 2020

23 Jan, 2020 2 commits
- Draft of new context switching tasks. · 83c6e622
  FritzFlorian committed Jan 23, 2020
- Add custom context switch library. · e2092e63
  The rationale for a custom implementation is that the existing solutions are quite a bit slower and/or require more memory.
  FritzFlorian committed Jan 23, 2020
20 Dec, 2019 1 commit
- Add two 'standardized' benchmarks. · 79ac0243
  FritzFlorian committed Dec 20, 2019
05 Dec, 2019 1 commit

Minor changes for profiling and add more alignment. · 2f539691

The idea is to exclude as many sources as possible that could lead to issues with contention and cache misses. After some experimentation, we think that hyperthreading is simply not working very well with our kind of workload. In the future we might simply test on other hardware.

committed Dec 05, 2019

04 Dec, 2019 1 commit
- Working version of our trading-deque · e34ea267
  FritzFlorian committed Dec 04, 2019
02 Dec, 2019 1 commit
- Sketch out idea for lock free trading deque. · 4cf3848f
  FritzFlorian committed Dec 02, 2019
29 Nov, 2019 3 commits

First 'crash free' version. · 1b576824

This version runs through our initial fft and fib tests. However, it is not tested further in any way. Additionally, we added a locking deque, potentially hurting performance and moving away from our initial goal.

committed Nov 29, 2019

WIP: Partly functional version. Stealing and continuation trading works 'most' of the time. · c6dd2fc0

The main issue still seems to be that we have a lock-free protocol in which a steal can be pending. We plan to remove this next by introducing a protocol that works on a single atomic update.
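
To show what "a single atomic update" can look like in general, here is a generic sketch in which a thief claims a task with one compare-and-swap, so there is no intermediate "pending" state. It is not the protocol from this repository; `empty_slot` and `try_claim` are invented names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr std::uint64_t empty_slot = 0;  // hypothetical "no task" marker

// Tries to take the task id stored in slot. The claim either fully
// succeeds or fully fails in one atomic step; nothing is left half done.
// Returns the claimed id, or empty_slot if there was nothing to claim or
// another thread won the race.
inline std::uint64_t try_claim(std::atomic<std::uint64_t>& slot) {
  std::uint64_t seen = slot.load(std::memory_order_acquire);
  if (seen == empty_slot) return empty_slot;
  if (slot.compare_exchange_strong(seen, empty_slot,
                                   std::memory_order_acq_rel,
                                   std::memory_order_acquire)) {
    return seen;
  }
  return empty_slot;
}
```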

committed Nov 29, 2019

WIP: We plan to fully remove the start property from the cont manager. · adf05e9a

The start_chain property does not make sense, as chains are purely 'virtual', i.e. they only fully exist when walking through the computation (by patching them on important events). We initially added the property as a helper for better runtime and simpler implementation, but we think without it we will not get as much inconsistency in the runtime state. Performance can be 're-added' later on.

committed Nov 29, 2019

27 Nov, 2019 2 commits
- WIP: Major flaws fixed. Edge cases at beginning missing and cleanup for conts missing. · 21733e4c
  FritzFlorian committed Nov 27, 2019
- WIP: Refactor memory manager to reduce redundancy. · 69fd7e0c
  It is still not working; however, we no longer have redundant code, which makes debugging simpler.
  FritzFlorian committed Nov 27, 2019
25 Nov, 2019 1 commit

WIP: Add first performance tests of single threaded execution. · 8668cad2

We changed up some of the memory constraints in the lock free deque and will need to see if this is ok. If so, the single threaded performance looks very good.

committed Nov 25, 2019

19 Nov, 2019 1 commit

WIP: Fast path with 'outlined' slow path code in place. · c2d4bc25

Everything so far is untested. We only made sure the fast path still seems to function correctly. Next up is writing tests for both the fast and the slow path, and then introducing the slow path. After that we can look at performance optimizations.

committed Nov 19, 2019

07 Nov, 2019 1 commit

WIP: First implementation of serial/fast path. · 842b518f

This showcases the expected performance when a task executes a sub-tree without interference from other threads. We aim to stay at about 6x slower than a normal function call.

committed Nov 07, 2019

06 Nov, 2019 3 commits
- WIP: Sketch fast path of task manager. · 39d2fbd8
  FritzFlorian committed Nov 06, 2019
- WIP: Initialization of continuation chains. · d3b64a85
  FritzFlorian committed Nov 06, 2019
- WIP: Sketch continuation and task class. · 740ae661
  This first sketch of the classes captures what we think is needed in terms of general interface and is very much WIP.
  FritzFlorian committed Nov 06, 2019
05 Nov, 2019 1 commit

WIP: re-work static memory allocation for scheduler. · 693d4e9b

We changed how the memory is allocated: instead of passing char* buffers in which objects are then stored, we now create 'fat objects' that hold all scheduler state. This eases development for us, as we can make changes to data structures without too much effort (e.g. add a second array to manage tasks if required).
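
A 'fat object' in this sense might look like the sketch below, where compile-time sizes replace raw buffers and all state lives inside one object. The names `task` and `static_scheduler_memory` are assumptions made for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical task record.
struct task {
  int id = 0;
};

// One object that contains all scheduler state. Adding another array later
// only means adding another member; no buffer arithmetic has to change.
template <std::size_t NumThreads, std::size_t TasksPerThread>
struct static_scheduler_memory {
  std::array<std::array<task, TasksPerThread>, NumThreads> tasks{};

  static constexpr std::size_t num_threads() { return NumThreads; }

  task& task_for(std::size_t thread, std::size_t index) {
    return tasks[thread][index];
  }
};
```

Such an object can be placed in static storage, which keeps the allocation scheme static while the compiler takes care of sizes and alignment.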

committed Nov 05, 2019

02 Oct, 2019 1 commit

Add destructor calls to tasks. · 5bc35f9e

Our stack does not call the destructors of its elements. This is problematic for, e.g., the graph implementation, where reference-counted images are held in tasks. To solve this for now, we manually call the destructor after each task. We do so because a generic, virtual destructor adds runtime cost to primitive tasks, which would require us to re-run all benchmarks; with this change we avoid that, and since we are re-working the scheduler anyway, we postpone a clean implementation until then.
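
The underlying C++ mechanism can be sketched as follows: an object built with placement new in raw storage is never destroyed automatically, so its destructor has to be invoked explicitly. `image_task` and `run_task_in_raw_storage` are hypothetical names, and a `std::shared_ptr<int>` stands in for a reference-counted image.

```cpp
#include <cassert>
#include <memory>
#include <new>

// Hypothetical task that keeps a reference-counted resource alive.
struct image_task {
  std::shared_ptr<int> image;
};

// Builds a task in raw storage, as a custom stack would, and returns the
// reference count observed while the task exists.
inline long run_task_in_raw_storage(const std::shared_ptr<int>& image) {
  alignas(image_task) unsigned char storage[sizeof(image_task)];
  image_task* task = new (storage) image_task{image};  // task holds a reference
  const long count_during_task = image.use_count();
  // Raw storage never runs destructors on its own; without this call the
  // reference held by the task would leak.
  task->~image_task();
  return count_during_task;
}
```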

committed Oct 02, 2019

01 Oct, 2019 1 commit
- Rework spawn ordering/waiting on pipeline tasks · eca0dd4d
  FritzFlorian committed Oct 01, 2019