Commits · d5b66aba1164454989f743f32f6488a8f38e55cd · las3_pub / predictable_parallel_patterns

17 Feb, 2020 1 commit
- Change FFT benchmark to use statically allocated temporary arrays. · d5b66aba
  FritzFlorian committed Feb 17, 2020
09 Feb, 2020 2 commits
- Add cscontext ARMv7 assembly and fast path optimization. · 89b6e3cb
  FritzFlorian committed Feb 09, 2020
- First working version on both ARM and x86. · 3c60e8d7
  FritzFlorian committed Feb 09, 2020
05 Feb, 2020 2 commits
- WIP: Remove unneeded attributes from scheduler. · 731b47c5
  FritzFlorian committed Feb 05, 2020
- WIP: Add minimal example to trigger tsan bugs. · 7c8d7a83
  FritzFlorian committed Feb 05, 2020
03 Feb, 2020 1 commit
- WIP: Add fcontext and thread sanitizer support to our coroutine abstraction. · c25e6134
  FritzFlorian committed Feb 03, 2020
30 Jan, 2020 4 commits
- WIP: Temporarily fix CMake issues with export targets. · 4a44ad9f
```
Older CMake versions won't work with export targets in different directories. For now, we simply add the context_switcher manually to the export target of pls.
```
  FritzFlorian committed Jan 30, 2020
- Fix matrix multiplication benchmark for new scheduler. · 7796022f
  FritzFlorian committed Jan 30, 2020
- Change context_switch app to use context_switcher library. · 01596ff3
  FritzFlorian committed Jan 30, 2020
- WIP: clean through scheduler code and fix obvious issues. · 6ee522a3
```
We still see very sporadic crashes; however, the current version is at least a starting point for refactoring and debugging. The next step is to re-enable tooling support (i.e., add code to let sanitizers do their work).
```
  FritzFlorian committed Jan 30, 2020
29 Jan, 2020 1 commit

WIP: Add simple external trading deque test. · 2adb2d16

The current version has race conditions and is hard to debug (especially because of the fibers: if the wrong thread executes on a fiber, we get segfaults very quickly). To combat this mess, we now refactor the code bit by bit while also adding tests where it can be done with reasonable effort.

committed Jan 29, 2020

27 Jan, 2020 2 commits

WIP: First running version of stealing. · 22f4c598

The project is currently really messy, and there are sporadic SIGSEGVs. This indicates that we still have a race in our code. Thread Sanitizer does not work with our current implementation, as it needs annotations for fibers.

The next step is to clean up the project and maybe add Thread Sanitizer support to our fiber implementation. This should help find the remaining bugs.

committed Jan 27, 2020

WIP: first stable version of stealing outline. · c85f2d0f
FritzFlorian committed Jan 27, 2020

26 Jan, 2020 1 commit
- Sketch out deque sync in case of a steal. · eecbe38d
  FritzFlorian committed Jan 26, 2020
24 Jan, 2020 2 commits

Sketch 'externally trading task deque'. · 0141a57a

The deque trades tasks when stealing. Right now, only the fast local path is implemented and tested. For the next step to work, we also need to add the resource stack and resource trading to the system.

committed Jan 24, 2020

Sketch minimal serial calling sequence. · 625836aa

The current state shows the minimum actions taken to execute a parallel call: get the thread local, find the active frame, execute on the next frame and return to the active frame.

committed Jan 24, 2020

23 Jan, 2020 4 commits
- Draft of new context switching tasks. · 83c6e622
  FritzFlorian committed Jan 23, 2020
- Add plots to context switch benchmarks. · 5e0ce1f5
  FritzFlorian committed Jan 23, 2020
- Add custom context switch library. · e2092e63
```
The rationale for a custom implementation is that the existing solutions are quite a bit slower and/or require more memory.
```
  FritzFlorian committed Jan 23, 2020
- Add benchmark results from running on x86 and arm32. · af75e21a
  FritzFlorian committed Jan 23, 2020
22 Jan, 2020 1 commit

Implement custom fast call fiber for arm32. · 3f7b5ad0

The basic calling works. Next, we measure on both x86 and ARM and then decide how to implement our fiber/'staggered stack' abstraction.

committed Jan 22, 2020

21 Jan, 2020 1 commit

Add custom 'fast fiber call' implementation to comparison. · 5b791d0e

We now cover all implementations that have a chance of being fast.
ARM implementations for our 'fast fiber call' are still missing. Once we add them, we will decide how to proceed.

committed Jan 21, 2020

20 Jan, 2020 1 commit
- Add alternative context switch implementation (similar to boost). · 3a6b724d
  FritzFlorian committed Jan 20, 2020
13 Jan, 2020 1 commit
- Extend context switch example for arm32. · 540fb8ed
  FritzFlorian committed Jan 13, 2020
10 Jan, 2020 1 commit

Add minimal example for x86_64 user level threads. · 5490e966

We implement a minimal concept of user-level threads. This shows the minimum requirements for our 'staggered' stack implementation: we need to be able to switch to a new stack and allow someone else to continue the calling function right before the switch.

committed Jan 10, 2020

04 Jan, 2020 1 commit
- Add fib benchmark. · d054e1ab
  FritzFlorian committed Jan 04, 2020
20 Dec, 2019 1 commit
- Add two 'standardized' benchmarks. · 79ac0243
  FritzFlorian committed Dec 20, 2019
05 Dec, 2019 1 commit

Minor changes for profiling and add more alignment. · 2f539691

The idea is to exclude as many sources as possible that could lead to issues with contention and cache misses. After some experimentation, we think that hyperthreading is simply not working very well with our kind of workload. In the future we might simply test on other hardware.

committed Dec 05, 2019

04 Dec, 2019 1 commit
- Working version of our trading-deque · e34ea267
  FritzFlorian committed Dec 04, 2019
29 Nov, 2019 3 commits

First 'crash free' version. · 1b576824

This version runs through our initial fft and fib tests. However, it is not tested further in any way. Additionally, we added a locking deque, potentially hurting performance and moving away from our initial goal.

committed Nov 29, 2019

WIP: Partly functional version. Stealing and continuation trading work 'most' of the time. · c6dd2fc0

The main issue still seems to be that we have a lock-free protocol in which a steal can be pending. We plan to remove this next by introducing a protocol that works on a single atomic update.

committed Nov 29, 2019

WIP: We plan to fully remove the start property from the cont manager. · adf05e9a

The start_chain property does not make sense, as chains are purely 'virtual', i.e. they only fully exist when walking through the computation (by patching them on important events). We initially added the property as a helper for better runtime and simpler implementation, but we think without it we will not get as much inconsistency in the runtime state. Performance can be 're-added' later on.

committed Nov 29, 2019

27 Nov, 2019 2 commits
- WIP: Major flaws fixed. Edge cases at beginning missing and cleanup for conts missing. · 21733e4c
  FritzFlorian committed Nov 27, 2019
- WIP: Refactor memory manager to reduce redundancy. · 69fd7e0c
```
It is still not working; however, we no longer have redundant code, which makes debugging simpler.
```
  FritzFlorian committed Nov 27, 2019
25 Nov, 2019 1 commit

WIP: Add first performance tests of single threaded execution. · 8668cad2

We changed up some of the memory constraints in the lock free deque and will need to see if this is ok. If so, the single threaded performance looks very good.

committed Nov 25, 2019

19 Nov, 2019 1 commit

WIP: Fast path with 'outlined' slow path code in place. · c2d4bc25

Everything so far is untested. We only made sure the fast path still seems to function correctly. Next up is writing tests for both the fast and slow path, and then introducing the slow path. After that, we can look at performance optimizations.

committed Nov 19, 2019

07 Nov, 2019 1 commit

WIP: First implementation of serial/fast path. · 842b518f

This showcases the expected performance when a task executes a sub-tree without interference from other threads. Our target is to stay about 6x slower than a normal function call.

committed Nov 07, 2019

01 Oct, 2019 1 commit
- Rework spawn ordering/waiting on pipeline tasks · eca0dd4d
  FritzFlorian committed Oct 01, 2019
16 Sep, 2019 1 commit
- Fix: App used old threading interface. · ef19ea1b
  FritzFlorian committed Sep 16, 2019
01 Aug, 2019 1 commit
- Change both stack and queue to same offset counters. · e403e498
```
This allows the stack and deque classes to use the same offset, so they work better together.
```
  FritzFlorian committed Aug 01, 2019