
Make it possible to specify a given SYCL device programmatically #21

Open
ogrisel opened this issue Sep 28, 2022 · 2 comments

@ogrisel
Collaborator

ogrisel commented Sep 28, 2022

At the moment the only way to select a given device (CPU, GPU, with Level Zero, OpenCL, or the default host device) is via the SYCL_DEVICE_FILTER environment variable. It would be convenient to be able to select a device programmatically (from within a running Python program, e.g. in a notebook) instead.
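For context, here is a minimal sketch of the current environment-variable approach (the filter string is just an example). The main limitation is that SYCL_DEVICE_FILTER has to be set before the SYCL runtime is initialized, which makes per-call device selection from a running notebook impractical:

import os

# Must be set before importing anything that initializes the SYCL runtime.
os.environ["SYCL_DEVICE_FILTER"] = "opencl:gpu:0"

import dpctl
print(dpctl.select_default_device())  # default device is now constrained by the filter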

However, I am not sure if and how we should extend the engine API to allow for this. One way to do that would be to allow engine provider names to include an extra string spec to select the device:

with config_context(engine_provider="sklearn_numba_dpex:opencl:gpu:0"):
    model.fit(X_train, y_train)

Alternatively, we could allow for a separate config param named engine_device, in which case we could directly accept a dpctl device instance for SYCL engine providers:

device = dpctl.SyclDevice("level_zero:gpu:2")
with config_context(engine_provider="sklearn_numba_dpex", engine_device=device):
    model.fit(X_train, y_train)

A similar problem will arise with different device specifications if we implement plugins for CUDA or ROCm backends instead of using the SYCL indirection (e.g. for potential cuML or pykeops plugins, or even vanilla numba without dpex).

For reference, PyTorch allows explicitly calling some_data.to(device) and model.to(device), and if the data and the model are not on the same device, calling model(some_data) will fail. It's nice because it's explicit, but maybe it is not convenient in the case of scikit-learn because…
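For illustration, a minimal PyTorch sketch of that explicit behavior (assuming a CUDA device is available):

import torch

device = torch.device("cuda:0")
model = torch.nn.Linear(10, 1).to(device)  # parameters moved to the GPU

some_data = torch.randn(4, 10)             # still allocated on the host (CPU)
# model(some_data)                         # would raise: inputs and weights on different devices
out = model(some_data.to(device))          # works once the data is moved explicitly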

@ogrisel ogrisel changed the title Make it possible to specify a given sycl device programmatically Make it possible to specify a given SYCL device programmatically Sep 28, 2022
@fcharras
Collaborator

fcharras commented Dec 1, 2022

@ogrisel after #62, which is merged now, it's possible to trigger this switch by using input data of type dpctl.tensor.usm_ndarray or dpnp.ndarray: the compute will happen on the same device where the data is stored (I've seen this behavior referred to as "compute follows data"). The data can be moved by the user, e.g. using to_device. The UX might not check all the boxes (for the same reasons given in scikit-learn/scikit-learn#25000), but do you think it is enough to close this issue?
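For illustration, a rough sketch of that compute-follows-data usage; the estimator choice and engine activation details here are assumptions for the example, not verbatim from #62:

import dpctl.tensor as dpt
from sklearn import config_context
from sklearn.cluster import KMeans

# Allocate the input on a specific SYCL device; the computation follows the data.
X_device = dpt.asarray(X_train, device="level_zero:gpu:0")

with config_context(engine_provider="sklearn_numba_dpex"):
    KMeans(n_clusters=8).fit(X_device)  # runs on the device that holds X_device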

@ogrisel
Collaborator Author

ogrisel commented Dec 1, 2022

I think we should keep it open until we get a full fix for the case where the data is passed as a host-allocated numpy array.
