
GPU support #356

Open
mhrmsn opened this issue Aug 30, 2024 · 2 comments
Labels
enhancement Expand / change existing functionality

Comments

mhrmsn commented Aug 30, 2024

I recently looked into this, and after discussion with @AdrianSosic and @Scienfitz, here are my observations:

I think there are two convenient ways to support GPUs: either by letting the user call torch.set_default_device("cuda"), or by adding a configuration variable within the package, something like baybe.options.device = "cuda". I only tested the first one. I think the latter is a bit more complex, but it would potentially allow different parts of the overall BayBE workflow to use different devices (this may be useful if, for example, you are generating embeddings from a torch-based neural network for use in BayBE).
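For illustration, the second approach could look something like the following minimal sketch of a package-level options object. This is purely hypothetical: the `Options` class, the `_VALID_DEVICES` tuple, and the `baybe.options.device` style API are illustrative names, not BayBE's actual interface.

```python
# Hypothetical sketch of a package-level options object for device selection.
# Names are illustrative, not BayBE's actual API.

_VALID_DEVICES = ("cpu", "cuda", "mps")


class Options:
    """Minimal package-wide settings container."""

    def __init__(self) -> None:
        self._device = "cpu"  # conservative default

    @property
    def device(self) -> str:
        return self._device

    @device.setter
    def device(self, value: str) -> None:
        # Accept forms like "cuda:1" by validating only the device-type prefix.
        if value.split(":")[0] not in _VALID_DEVICES:
            raise ValueError(f"Unknown device: {value!r}")
        self._device = value


# A module-level singleton, analogous to the suggested `baybe.options`.
options = Options()
```

Code inside the package would then consult `options.device` (e.g. via `tensor.to(options.device)`) wherever tensors are created or moved, which is what would make per-stage device choices possible.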

When experimenting with torch.set_default_device("cuda"), I noticed that the devices for the tensors are not consistently set in BayBE. For either solution I think these points would need to be addressed:

@Scienfitz Scienfitz added the enhancement Expand / change existing functionality label Aug 30, 2024
mhrmsn commented Sep 2, 2024

The last point is now addressed in this PR: pytorch/botorch#2502

AdrianSosic (Collaborator) commented

Hi @mhrmsn, thanks for investigating and raising the botorch issue 🥇

Solid GPU support is definitely something we should target and, in particular, also test – which brings up yet another point to think about. That involves figuring out both what reasonable GPU tests look like and how/where we can run them (AFAIK GitHub now offers GPU runners, but I have no clue whether we can get easy access).
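As a rough sketch of how such tests could degrade gracefully on CPU-only runners, here is a guarded test using plain `unittest` (the test case and helper names are illustrative, not part of any existing test suite):

```python
# Sketch: skip GPU tests when torch or CUDA is unavailable.
# Test-case and helper names are hypothetical.
import importlib.util
import unittest


def _cuda_available() -> bool:
    """Return True only if torch is installed and sees a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.cuda.is_available()


requires_gpu = unittest.skipUnless(_cuda_available(), "CUDA not available")


class DefaultDeviceTest(unittest.TestCase):
    @requires_gpu
    def test_factory_respects_default_device(self):
        import torch

        torch.set_default_device("cuda")
        try:
            # Factory functions should place new tensors on the default device.
            self.assertEqual(torch.empty(1).device.type, "cuda")
        finally:
            torch.set_default_device("cpu")  # restore for other tests
```

On a CPU-only runner the test is reported as skipped rather than failed, so the same suite can run unchanged on GPU runners once they are available.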

Regarding the other points you mentioned:

  • Device: yes, we probably need a thorough screening of the codebase to see where the cpu<->gpu movements need to be introduced
  • Configuration: I think both options are fine. torch.set_default_device("cuda") is something users can then easily do once the above points are addressed. And agreed: I've been thinking for quite a while about adding a "settings" mechanism (also to control other things, like floating-point precision). I have a few ideas in mind but haven't really had the time to implement them yet. Perhaps we can talk about it once I'm back from vacation?
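One possible shape for such a settings mechanism, sketched here with entirely hypothetical names, is a context manager that temporarily overrides package-wide defaults such as device and floating-point precision:

```python
# Hypothetical sketch of a "settings" mechanism with scoped overrides.
# All names (_settings, get_setting, settings) are illustrative.
from contextlib import contextmanager

# Package-wide defaults.
_settings = {"device": "cpu", "dtype": "float64"}


def get_setting(name: str):
    """Read the current value of a package-wide setting."""
    return _settings[name]


@contextmanager
def settings(**overrides):
    """Temporarily override package-wide settings within a `with` block."""
    unknown = set(overrides) - set(_settings)
    if unknown:
        raise KeyError(f"Unknown settings: {sorted(unknown)}")
    previous = {k: _settings[k] for k in overrides}
    _settings.update(overrides)
    try:
        yield
    finally:
        # Restore the previous values even if the block raises.
        _settings.update(previous)
```

Code inside the package would then read `get_setting("device")` when creating or moving tensors, so different stages of a workflow could run under different devices or precisions without touching torch's global default.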
