
Experimental dpctl support for native_cpu device #2051

ndgrigorian asked this question in Ideas
The oneAPI DPC++ compiler now has experimental support for a "native CPU" device, which treats the host CPU as a "first-class citizen."
This discussion is meant both to explore the use of native_cpu devices and to provide convenient instructions for getting started with dpctl and native_cpu targets.

OS: Ubuntu 24.04 Noble
CPU: 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz

Initial step
I first created a Conda environment containing the requirements for building dpctl (see the documentation).

Setting up the compiler
I cloned the DPC++ compiler from the GitHub repo for the oneAPI DPC++ compiler. With my local copy, I read through the documentation and (after one failed experiment) found that I could successfully build the compiler, and subsequently dpctl, using the following steps.

From the repo root, I ran

python buildbot/configure.py --native-cpu --llvm-external-projects="lld" 

having found that, without --llvm-external-projects="lld", dpctl would fail to build with an error citing lld.

After configuring, I ran

python buildbot/compile.py

This took quite a while, but it succeeded, producing the built compiler in /path/to/repo/llvm/build/install, with clang and clang++ in bin. I also verified that the UR adapter for native_cpu was present in lib.
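
As a quick sanity check, the install tree can also be inspected from Python. A minimal sketch, assuming the paths from this walkthrough; the adapter file name varies by build, so the glob pattern below is an assumption:

from pathlib import Path

# Install prefix from this walkthrough; adjust to your checkout
install = Path("/path/to/repo/llvm/build/install")

# The compiler drivers should be in bin
print((install / "bin" / "clang").exists(), (install / "bin" / "clang++").exists())

# The UR adapter for native_cpu should be somewhere in lib
# (the "*native_cpu*" pattern is an assumption about its file name)
print([p.name for p in (install / "lib").glob("*native_cpu*")])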

Building dpctl
With the compiler built, I then set up the environment much as when building dpctl with the nightly compiler, fetching the other dependencies:

# OpenCL CPU runtime (oclcpuexp) and oneTBB, as in the nightly-build instructions
tar xf sycl_linux.tar.gz -C dpcpp
mkdir oclcpuexp
wget https://github.com/intel/llvm/releases/download/2024-WW43/oclcpuexp-20241810.0.08_rel.tar.gz
tar xf oclcpuexp-20241810.0.08_rel.tar.gz -C ./oclcpuexp
wget https://github.com/oneapi-src/oneTBB/releases/download/v2021.12.0/oneapi-tbb-2021120-lin.tgz
tar xf oneapi-tbb-2021120-lin.tgz
# Make the OpenCL ICD loader visible next to the runtime libraries
cp oclcpuexp/x64/libOpenCL.so* lib/

then set up LD_LIBRARY_PATH and PATH, similarly to nightly builds:

cat << 'EOF' > set_allvars.sh
#!/usr/bin/bash
export COMPILER_ROOT_DIR=/path/to/compiler/llvm/build/install
export PATH=${COMPILER_ROOT_DIR}/bin:${PATH}
export LD_LIBRARY_PATH=${COMPILER_ROOT_DIR}/lib:${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH=${COMPILER_ROOT_DIR}/oclcpuexp/x64:${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH=${COMPILER_ROOT_DIR}/oneapi-tbb-2021120/lib/intel64/gcc4.8:${LD_LIBRARY_PATH}
export OCL_ICD_VENDORS=
export OCL_ICD_FILENAMES=libintelocl.so
EOF
chmod +x set_allvars.sh
cat set_allvars.sh

After sourcing set_allvars.sh, I ran sycl-ls to verify that the native_cpu device showed up:

[level_zero:gpu][level_zero:0] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Graphics [0x9a49] 12.0.0 [1.3.29735+27]
[native_cpu:cpu][native_cpu:0] SYCL_NATIVE_CPU, SYCL Native CPU 0.1 [0.0.0]
[opencl:gpu][opencl:0] Intel(R) OpenCL Graphics, Intel(R) Graphics [0x9a49] OpenCL 3.0 NEO [24.39.31294]

and it worked!

Now I ran

python scripts/build_locally.py --verbose --compiler-root ${COMPILER_ROOT_DIR} --c-compiler ${COMPILER_ROOT_DIR}/bin/clang --cxx-compiler ${COMPILER_ROOT_DIR}/bin/clang++ --cmake-opts="-DDPCTL_SYCL_TARGETS=native_cpu"

...this worked, up to a point. The _tensor_linalg sub-module failed to build and, after it, so did the _tensor_sorting sub-module. I commented both of these out.

The build eventually succeeded, though warnings were emitted for a significant number of math functions, such as some trigonometric functions, log1pf, etc.

After this, it was possible to import dpctl, run dpctl.lsplatform(2), and see SYCL_NATIVE_CPU listed as a platform:

In [1]: import dpctl
In [2]: dpctl.lsplatform(2)
Platform 0 ::
    Name            Intel(R) oneAPI Unified Runtime over Level-Zero
    Version         1.3
    Vendor          Intel(R) Corporation
    Backend         ext_oneapi_level_zero
    Num Devices     1
    # 0
        Name            Intel(R) Graphics [0x9a49]
        Version         1.3.29735+27
        Filter string   level_zero:gpu:0
Platform 1 ::
    Name            SYCL_NATIVE_CPU
    Version         0.1
    Vendor          tbd
    Backend         ext_oneapi_native_cpu
    Num Devices     1
    # 0
        Name            SYCL Native CPU
        Version         0.0.0
        Filter string   unknown:cpu:0
Platform 2 ::
    Name            Intel(R) OpenCL Graphics
    Version         OpenCL 3.0
    Vendor          Intel(R) Corporation
    Backend         opencl
    Num Devices     1
    # 0
        Name            Intel(R) Graphics [0x9a49]
        Version         24.39.31294
        Filter string   opencl:gpu:0

Implementing native_cpu in dpctl
dpctl.get_devices() would ignore the native_cpu device because it hadn't been hooked up in dpctl's machinery, so I made adjustments to enable it:

In [1]: import dpctl.tensor as dpt, dpctl, numpy as np
In [2]: dpctl.get_devices()
Out[2]:
[<dpctl.SyclDevice [backend_type.level_zero, device_type.gpu, Intel(R) Graphics [0x9a49]] at 0x7fe6835e9b70>,
 <dpctl.SyclDevice [backend_type.native_cpu, device_type.cpu, SYCL Native CPU] at 0x7fe6835ea530>,
 <dpctl.SyclDevice [backend_type.opencl, device_type.gpu, Intel(R) Graphics [0x9a49]] at 0x7fe6835ea3b0>]
In [3]: x = dpt.arange(10**6, device="cpu")
In [4]: x
Out[4]: usm_ndarray([ 0, 1, 2, ..., 999997, 999998, 999999])
In [5]: x.sycl_device
Out[5]: <dpctl.SyclDevice [backend_type.native_cpu, device_type.cpu, SYCL Native CPU] at 0x7fe682f43730>
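
A programmatic way to pick the device out of that list, as a sketch (backend_type.native_cpu exists only on this experimental branch):

import dpctl

# On the experimental branch, native_cpu is a regular backend_type member,
# so the device can be selected like any other backend
native_cpu_devices = [
    d for d in dpctl.get_devices(device_type="cpu")
    if d.backend == dpctl.backend_type.native_cpu
]
print(native_cpu_devices)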

And it's a success! Some kernels can even be run with it:

In [1]: import dpctl.tensor as dpt, dpctl, numpy as np
In [2]: x = dpt.arange(10**6, device="cpu")
In [3]: y = dpt.ones(10**6, dtype=x.dtype, device="cpu")
In [4]: r = x + y
In [5]: r
Out[5]: usm_ndarray([ 1, 2, 3, ..., 999998, 999999, 1000000])
In [6]: r.sycl_device
Out[6]: <dpctl.SyclDevice [backend_type.native_cpu, device_type.cpu, SYCL Native CPU] at 0x7f2c1fd38370>
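
Given the math-function warnings during the build, it seems worth spot-checking more than simple addition. A minimal sketch, assuming the native_cpu device is selected by device="cpu" as above, that exercises an elementwise kernel, a reduction, and one of the warned-about functions (log1pf) against NumPy:

import numpy as np
import dpctl.tensor as dpt

# Elementwise arithmetic followed by a reduction kernel
x = dpt.linspace(0, 1, num=1024, device="cpu")
y = x * x + 1
print(dpt.asnumpy(dpt.sum(y)))

# Spot-check log1p, one of the functions the build warned about
x_np = np.linspace(0.0, 2.0, num=16, dtype=np.float32)
x = dpt.asarray(x_np, device="cpu")
print(np.allclose(dpt.asnumpy(dpt.log1p(x)), np.log1p(x_np)))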

Public branch
To experiment with this, the branch experimental/support-native-cpu-device has been made available; it comments out the failing sub-modules and hooks native_cpu into dpctl's machinery.
