Building fuser project

How to build the Fuser project from source.

Build issues are tracked in https://github.com/NVIDIA/Fuser/issues/101. If you run into problems, open a new issue and link it to that tracking issue.

Option 1 - build nvfuser as part of pytorch's subdirectory (submodule build):

~~clone pytorch code base to your local machine. https://github.com/pytorch/pytorch (we do nightly build against pytorch upstream. viable/strict branch is recommended for stable CI.)~~
~~clone fuser code base to your local machine. https://github.com/NVIDIA/Fuser (default to main, which is our development branch)~~
Build PyTorch from [PATH_TO_PYTORCH] with the environment variable NVFUSER_SOURCE_DIR=[PATH_TO_FUSER] exported. The build then uses your local Fuser checkout instead of the submodule copy that ships with PyTorch under [PATH_TO_PYTORCH]/third_party/nvfuser.
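A minimal sketch of the submodule build, assuming both repositories are already cloned. The default paths $HOME/pytorch and $HOME/Fuser are placeholders, python setup.py develop is only assumed here as the PyTorch build command (use whatever you normally build PyTorch with), and the run helper is hypothetical: it prints each command and executes it only when DRY_RUN=0 is set.

```shell
#!/bin/sh
set -e

# Placeholder paths (assumptions): adjust to your local checkouts.
PATH_TO_PYTORCH="${PATH_TO_PYTORCH:-$HOME/pytorch}"
PATH_TO_FUSER="${PATH_TO_FUSER:-$HOME/Fuser}"

# Hypothetical helper: print each command; execute only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Point the PyTorch build at the local Fuser checkout instead of
# the copy under third_party/nvfuser.
export NVFUSER_SOURCE_DIR="$PATH_TO_FUSER"

# Only enter the PyTorch checkout when actually building.
if [ "${DRY_RUN:-1}" = "0" ]; then cd "$PATH_TO_PYTORCH"; fi

# Assumed PyTorch build command; substitute your usual one.
run python setup.py develop
```

Run it once as-is to review the printed commands, then again with DRY_RUN=0 to build.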

Option 2 - build nvfuser separately from PyTorch (standalone build):

1. Install PyTorch (either from source or via pip wheel).
2. Clone the Fuser code to your local machine: https://github.com/NVIDIA/Fuser
3. Update the nvfuser submodules: git submodule sync --recursive; git submodule update --init --recursive
4. Install the required pip modules: pip install -r requirements.txt
5. Build nvfuser with either python setup.py install or python setup.py develop.
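The standalone build steps above can be strung together roughly as follows. This is a sketch, not an official script: it assumes PyTorch is already installed, picks python setup.py develop for the final step, and uses a hypothetical run helper that prints each command and executes it only when DRY_RUN=0 is set.

```shell
#!/bin/sh
set -e

# Hypothetical helper: print each command; execute only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Installing PyTorch is assumed to be done already.

# Clone the Fuser repository (main is the default branch).
run git clone https://github.com/NVIDIA/Fuser
if [ "${DRY_RUN:-1}" = "0" ]; then cd Fuser; fi

# Bring the submodules in sync with the checkout.
run git submodule sync --recursive
run git submodule update --init --recursive

# Install the Python build requirements.
run pip install -r requirements.txt

# Build nvfuser in development mode (or use "install" instead).
run python setup.py develop
```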

A few notes on the standalone build:

~~nightly pip from pytorch currently doesn't work for our cpp tests: https://github.com/pytorch/pytorch/issues/98093~~
There was a packaging issue with PyTorch that has since been patched (https://github.com/pytorch/pytorch/pull/97404). As of 3/31/2023, the nightly pip wheel should work, but the PyTorch 2.0 pip wheel does not.
If you use python setup.py install or python setup.py develop, there is currently a packaging issue where the extension module is not picked up by the package on the first run, so you might need to run the command twice. See the issue "Missing C.cpython-39-x86_64-linux-gnu.so after first install with clean build".
If you want to specify the C++ standard, use --cpp=17 for C++17, and --cpp=20 for C++20.
setup.py is also being deprecated. For a debug build with a pip editable installation, this command was used locally (the trailing dot is the path argument): pip install -v --no-build-isolation --config-settings --build_option="--debug" -e .

Note for developers

For PRs that change the build system, we kindly ask you to verify the change in all three configurations:

submodule build;
standalone build against your locally built PyTorch;
[somewhat optional] standalone build against a pip-installed PyTorch (since the upstream pip package has issues, you might want to build a pip wheel locally and use that instead).

Additionally, consider checking the build tracking issue to see whether you can tick one of its boxes and add a link there to claim the credit: https://github.com/NVIDIA/Fuser/issues/101

Pip wheel build and usage

After finishing standalone build steps 1-5 above, you can run python setup.py bdist_wheel to build a pip wheel, which can be distributed and used on top of a pip-installed PyTorch wheel package. A few notes:

Build against upstream pip package - if you need to work with the upstream distributed PyTorch (https://pytorch.org/), make sure that your nvfuser is built against a PyTorch library with the same CXX ABI flag; otherwise you will see undefined symbols. The safest bet is to build against the upstream pip package directly.
Specify additional pip packages - the upstream PyTorch pip package can currently run on a system with no CUDA installation, and how nvfuser links against libnvrtc is a complicated story. In short, you need to declare the proper nvrtc package as a requirement of the wheel. This is done by passing -install_requires=... to the setup.py script, e.g. python setup.py bdist_wheel --no-test --no-benchmark -install_requires=nvidia-cuda-nvrtc-cu12 (tests and benchmarks are skipped here as well, since the pip wheel does not package them either).
Patch nvfuser binary for pip installation - after the nvfuser pip package is installed on your local machine, the nvfuser library targets need to be swapped with what upstream PyTorch ships. Running patch-nvfuser (which is defined as a wheel entry_point) is enough.
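The wheel notes above can be combined into one sequence, sketched below under a few assumptions. The ABI check through torch.compiled_with_cxx11_abi() is an addition for comparing environments and is not part of the original notes. WHEEL is a placeholder that must be set to the file bdist_wheel actually produces. The run helper is hypothetical: it prints each command and executes it only when DRY_RUN=0 is set.

```shell
#!/bin/sh
set -e

# Hypothetical helper: print each command; execute only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# nvrtc package the wheel should declare as a requirement.
NVRTC_PKG="nvidia-cuda-nvrtc-cu12"

# Placeholder (assumption): set WHEEL to the wheel file that was built.
WHEEL="${WHEEL:-dist/CHANGE_ME.whl}"

# Added sanity check: report the CXX11 ABI setting of the installed
# PyTorch so it can be compared with the target environment.
run python -c "import torch; print(torch.compiled_with_cxx11_abi())"

# Build the wheel, skipping tests and benchmarks (not packaged anyway).
run python setup.py bdist_wheel --no-test --no-benchmark "-install_requires=$NVRTC_PKG"

# Install the wheel on top of a pip-installed PyTorch.
run pip install "$WHEEL"

# Swap the nvfuser library targets for the ones upstream PyTorch ships.
run patch-nvfuser
```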
