Use SIMD internally #145

Open
gnzlbg opened this issue Aug 13, 2018 · 4 comments

@gnzlbg
Contributor

gnzlbg commented Aug 13, 2018

The sleef library is available under the Boost Software License (extremely permissive) and implements the libm intrinsics using SIMD instructions when possible. It currently supports several SIMD architectures:

  • x86 - SSE2, SSE4.1, AVX, FMA4, AVX2+FMA3, AVX512F
  • AArch64 - Advanced SIMD, SVE
  • AArch32 - NEON
  • PowerPC64 - VSX

libm should use cfg(target_feature) (stable) internally to detect the availability of SIMD instructions at compile-time and use them when it is possible and pays off.
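As a rough illustration of the compile-time selection described above, here is a minimal sketch for a single function. The names `sqrt_f64` and `sqrt_impl` are hypothetical and not part of libm's API, and the fallback simply calls the standard library as a stand-in for a real software implementation.

```rust
// Hypothetical sketch: `sqrt_f64` and `sqrt_impl` are illustrative names,
// not functions that exist in libm.

/// SSE2 path: only compiled when the target guarantees SSE2 at compile time.
#[cfg(all(target_arch = "x86_64", target_feature = "sse2"))]
fn sqrt_impl(x: f64) -> f64 {
    use core::arch::x86_64::{_mm_cvtsd_f64, _mm_set_sd, _mm_sqrt_sd};
    // SAFETY: the cfg on this function guarantees SSE2 is available.
    #[allow(unused_unsafe)]
    unsafe {
        let v = _mm_set_sd(x); // place x in the low lane
        _mm_cvtsd_f64(_mm_sqrt_sd(v, v)) // sqrt of the low lane, then extract it
    }
}

/// Fallback path for every other target.
/// Assumption: `f64::sqrt` stands in for a pure software implementation here.
#[cfg(not(all(target_arch = "x86_64", target_feature = "sse2")))]
fn sqrt_impl(x: f64) -> f64 {
    x.sqrt()
}

/// Public entry point; the implementation is chosen at compile time,
/// so there is no runtime dispatch cost.
pub fn sqrt_f64(x: f64) -> f64 {
    sqrt_impl(x)
}
```

Callers would just use `sqrt_f64(2.0)`; which body gets compiled depends on the target and on flags such as `-C target-feature`.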

We could add an unstable cargo feature to libm to enable using nightly-only features to detect and use asimd, neon, vsx, and AVX-512 (these are still unstable in core::arch). Some asimd, neon, vsx and AVX-512 intrinsics are already available in core::arch on nightly, and it shouldn't be too difficult to add any that are missing.

As long as we include the proper contribution notice, and maybe dual license the library under the Boost license (this should be easily possible), we could reuse the sleef implementation 1:1, which competes performance-wise with libm and Intel's libraries. There is a commit under review to add Sleef to LLVM, but it has been stalled for over a year.

@hanna-kruppe

Just so we're clear, are you talking about the scalar functions provided by sleef (e.g., double Sleef_sin_u10(double a);) or also about the element-wise operations on vectors?

@gnzlbg
Contributor Author

gnzlbg commented Aug 13, 2018

are you talking about the scalar functions provided by sleef (e.g., double Sleef_sin_u10(double a);)

The scalar functions, with an error of at most 1 ULP (those with the _u10 suffix).

@burrbull
Contributor

sleef-rs
sleefdp and sleefsp files ported from sleef library
not tested yet

@gnzlbg
Contributor Author

gnzlbg commented Aug 23, 2018

@burrbull the sleef-sys crate (a wrapper over the C library, which is what we are trying to avoid here) is used by packed_simd and is tested. sleef only supports x86_64 Linux, x86_64 Apple, and x86_64 Windows; 32-bit Linux is currently advertised but not working, which heavily limits the targets where it can be used. You can probably add sleef-sys as a dev-dependency to test sleef-rs against sleef on those targets at least, though.
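
One possible shape for such a comparison test, given the 1 ULP bound mentioned earlier in the thread, is sketched below. The helper names `ordered_bits`, `ulp_distance`, and `max_ulp_error` are hypothetical, and the port and reference are passed in as closures because the actual sleef-sys and sleef-rs function names are not assumed here.

```rust
// Hypothetical test helpers; none of these names come from sleef-sys or sleef-rs.

/// Map an f64 to an i64 so that adjacent floats map to adjacent integers
/// (and +0.0 and -0.0 both map to 0).
fn ordered_bits(x: f64) -> i64 {
    let b = x.to_bits() as i64;
    if b < 0 { i64::MIN - b } else { b }
}

/// Distance between two floats measured in units in the last place.
/// NaN inputs are reported as the maximum distance.
fn ulp_distance(a: f64, b: f64) -> u64 {
    if a.is_nan() || b.is_nan() {
        return u64::MAX;
    }
    ordered_bits(a).abs_diff(ordered_bits(b))
}

/// Largest ULP distance between `port` and `reference` over `inputs`.
fn max_ulp_error(
    port: impl Fn(f64) -> f64,
    reference: impl Fn(f64) -> f64,
    inputs: &[f64],
) -> u64 {
    inputs
        .iter()
        .map(|&x| ulp_distance(port(x), reference(x)))
        .max()
        .unwrap_or(0)
}
```

A test could then assert something like `max_ulp_error(port_sin, reference_sin, &inputs) <= 1`, where `port_sin` and `reference_sin` are placeholders for the ported function and the C-backed reference.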
