This repository has been archived by the owner on Nov 24, 2022. It is now read-only.

Implement parallel ahc-ld/ahc-link #621

Open
TerrorJack opened this issue Apr 27, 2020 · 0 comments · May be fixed by #622

TerrorJack commented Apr 27, 2020

Is your feature request related to a problem? Please describe.
Recent performance improvements have cut wall clock time by over 50% when linking large programs. To improve further, we should introduce parallelism in ahc-ld and ahc-link. There are several opportunities for parallelism, which are described in the next section.

Describe the solution you'd like

Opportunities for parallelism

  • When loading archives and object files in ahc-ld, we can parallelize the deserialization of each object file. All object files are first converted to ByteStrings, either by reading them directly or via ArchiveEntry; the deserialization can then be performed in parallel.
  • After the gc-sections pass is run, the shrunk AsteriusModule should be fully evaluated, and this can be done in parallel as well.
  • In the binaryen backend, we can parallelize the marshaling of different data segments and functions. binaryen will transparently switch to a new allocator when it notices it's allocating an IR node on a different thread, so we should ensure each Haskell worker thread is pinned using forkOn.

Method of parallelism

We cannot introduce additional dependencies like parallel, monad-par or scheduler here, since we need to strictly control our dependency surface. So we need to roll our own minimal parallelism framework first.

The need for nested parallelism can be avoided for our use cases. A simple parallel loop should be sufficient:

parallelFor :: Monoid r => Int -> [a] -> (a -> IO r) -> IO r

The first argument is the worker thread pool capacity, which should equal the number of CPU cores.
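One way parallelFor could be written using only base is sketched below. This is an illustration under assumptions, not the implementation from the linked pull request: the shared work queue, the one-MVar-per-item result slots, and the helper tryAny are all choices made for this sketch. Results are combined in input order, so a non-commutative Monoid still behaves as in the sequential case. Each result is forced only to weak head normal form; full evaluation would need something like deepseq, which is not shown here.

```haskell
import Control.Concurrent (forkOn)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, evaluate, throwIO, try)
import Control.Monad (forM, forM_)
import Data.IORef (atomicModifyIORef', newIORef)

-- Hypothetical helper: catch any exception thrown by a worker action.
tryAny :: IO a -> IO (Either SomeException a)
tryAny = try

-- Sketch of the proposed parallel loop; the body is an assumption.
parallelFor :: Monoid r => Int -> [a] -> (a -> IO r) -> IO r
parallelFor n xs f
  -- A pool size of 1 (or less) falls back to plain sequential code.
  | n <= 1 = mconcat <$> mapM f xs
  | otherwise = do
      -- Pair every input with an empty slot that will hold its result.
      slots <- forM xs $ \x -> do
        v <- newEmptyMVar
        pure (x, v)
      -- Shared queue of pending jobs; workers pop from it atomically.
      queue <- newIORef slots
      let worker = do
            next <- atomicModifyIORef' queue $ \q -> case q of
              [] -> ([], Nothing)
              (j : js) -> (js, Just j)
            case next of
              Nothing -> pure ()
              Just (x, v) -> do
                -- Force the result to WHNF on the worker thread.
                r <- tryAny (f x >>= evaluate)
                putMVar v r
                worker
      -- Pin one worker per capability index with forkOn.
      forM_ [0 .. n - 1] $ \cap -> forkOn cap worker
      -- Collect results in input order; rethrow the first failure.
      rs <- forM slots $ \(_, v) -> takeMVar v
      either throwIO pure (mconcat <$> sequence rs)
```

A call such as `parallelFor 4 objectFiles deserialize` (both names hypothetical) would then run up to four deserializations at once. Real parallelism requires the threaded RTS with enough capabilities (for example `+RTS -N`), since forkOn maps the given index onto the available capabilities.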

In addition, we should implement a link-time option for ahc-ld/ahc-link to allow overriding the worker thread pool size; setting it to 1 should fall back to sequential code to avoid threading overhead.
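A sketch of how such an option might be parsed with System.Console.GetOpt from base follows. The flag name `-j`/`--jobs`, the LinkOpts record, and both function names are hypothetical and not part of the actual ahc-ld/ahc-link command line. When the option is absent, the sketch defaults to the number of RTS capabilities, which matches the core count when the program runs with `+RTS -N`.

```haskell
import Control.Concurrent (getNumCapabilities)
import System.Console.GetOpt
  (ArgDescr (ReqArg), ArgOrder (Permute), OptDescr (Option), getOpt)
import Text.Read (readMaybe)

-- Hypothetical option record; not ahc-link's real option type.
newtype LinkOpts = LinkOpts {poolSize :: Maybe Int}

-- Hypothetical flag: -j N / --jobs=N sets the worker pool size.
options :: [OptDescr (LinkOpts -> Either String LinkOpts)]
options =
  [Option ['j'] ["jobs"] (ReqArg setJobs "N") "worker thread pool size"]
  where
    setJobs s o = case readMaybe s of
      Just n | n >= 1 -> Right o {poolSize = Just n}
      _ -> Left ("invalid pool size: " ++ s)

-- Parse the arguments; Nothing means the option was not given.
parsePoolSize :: [String] -> Either String (Maybe Int)
parsePoolSize args = case getOpt Permute options args of
  (fs, _, []) -> poolSize <$> foldl (>>=) (Right (LinkOpts Nothing)) fs
  (_, _, errs) -> Left (concat errs)

-- Default to the RTS capability count when no override is given.
resolvePoolSize :: Maybe Int -> IO Int
resolvePoolSize (Just n) = pure n
resolvePoolSize Nothing = getNumCapabilities
```

The resolved value would be passed as the first argument of parallelFor, so a pool size of 1 selects the sequential path without forking any threads.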
