
Make it possible to specify a given SYCL device programmatically #21

Open
ogrisel opened this issue Sep 28, 2022 · 2 comments

@ogrisel
Collaborator

ogrisel commented Sep 28, 2022

At the moment the only way to select a given device (CPU, GPU, with Level Zero, OpenCL, or the default host device) is via the SYCL_DEVICE_FILTER environment variable. It would be convenient to be able to select a device programmatically (from within a running Python program, e.g. in a notebook) instead.
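For context, here is a minimal sketch of the current environment-variable approach (the filter string is just an example). The main limitation is that SYCL_DEVICE_FILTER has to be set before the SYCL runtime is initialized, which makes per-call device selection from a running notebook impractical:

import os

# Must be set before importing anything that initializes the SYCL runtime.
os.environ["SYCL_DEVICE_FILTER"] = "opencl:gpu:0"

import dpctl
print(dpctl.select_default_device())  # default device is now constrained by the filter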

However, I am not sure if and how we should extend the engine API to allow for this. One way to do that would be to allow engine provider names to include an extra string spec to select the device:

with config_context(engine_provider="sklearn_numba_dpex:opencl:gpu:0"):
    model.fit(X_train, y_train)

Alternatively, we could allow for a separate config param named engine_device, in which case we could directly accept a dpctl device instance for SYCL engine providers:

device = dpctl.SyclDevice("level_zero:gpu:2")
with config_context(engine_provider="sklearn_numba_dpex", engine_device=device):
    model.fit(X_train, y_train)

A similar problem will arise with different device specifications if we implement plugins for CUDA or ROCm backends instead of using the SYCL indirection (e.g. for potential cuML or pykeops plugins, or even vanilla numba without dpex).

For reference, PyTorch allows explicitly calling some_data.to(device) and model.to(device), and if the data and the model are not on the same device, calling model(some_data) will fail. It's nice because it's explicit, but maybe it is not convenient in the case of scikit-learn because…
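For illustration, a minimal PyTorch sketch of that explicit behavior (assuming a CUDA device is available):

import torch

device = torch.device("cuda:0")
model = torch.nn.Linear(10, 1).to(device)  # parameters moved to the GPU

some_data = torch.randn(4, 10)             # still allocated on the host (CPU)
# model(some_data)                         # would raise: inputs and weights on different devices
out = model(some_data.to(device))          # works once the data is moved explicitly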

@ogrisel ogrisel changed the title Make it possible to specify a given sycl device programmatically Make it possible to specify a given SYCL device programmatically Sep 28, 2022
@fcharras
Collaborator

fcharras commented Dec 1, 2022

@ogrisel after #62, which is merged now, it's possible to trigger this switch by using input data of type dpctl.tensor.usm_ndarray or dpnp.ndarray: the compute will happen on the same device where the data is stored (I've seen this behavior referred to as "compute follows data"). The data can be moved by the user, e.g. using to_device. The UX might not check all the boxes (for the same reasons given in scikit-learn/scikit-learn#25000), but do you think it is enough to close this issue?
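For illustration, a rough sketch of that compute-follows-data usage; the estimator choice and engine activation details here are assumptions for the example, not verbatim from #62:

import dpctl.tensor as dpt
from sklearn import config_context
from sklearn.cluster import KMeans

# Allocate the input on a specific SYCL device; the computation follows the data.
X_device = dpt.asarray(X_train, device="level_zero:gpu:0")

with config_context(engine_provider="sklearn_numba_dpex"):
    KMeans(n_clusters=8).fit(X_device)  # runs on the device that holds X_device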

@ogrisel
Collaborator Author

ogrisel commented Dec 1, 2022

I think we should keep it open until we get a full fix for the case where the data is passed as a host-allocated numpy array.
