@@ -159,13 +159,41 @@ The following plugins from CodePlay are supported:
.. _codeplay_nv_plugin: https://developer.codeplay.com/products/oneapi/nvidia/
.. _codeplay_amd_plugin: https://developer.codeplay.com/products/oneapi/amd/

- ``dpctl`` can be built for CUDA devices as follows:
+ Builds for CUDA and AMD devices internally use SYCL alias targets that are passed to the compiler.
+ The full list of SYCL alias targets is given in the
+ `DPC++ Compiler User Manual <https://intel.github.io/llvm/UsersManual.html>`_.
+
+ CUDA build
+ ~~~~~~~~~~
+
+ ``dpctl`` can be built for CUDA devices using the ``DPCTL_TARGET_CUDA`` CMake option,
+ which accepts a specific compute architecture string:
+
+ .. code-block:: bash
+
+     python scripts/build_locally.py --verbose --cmake-opts="-DDPCTL_TARGET_CUDA=sm_80"
+
+ To use the default architecture (``sm_50``),
+ set ``DPCTL_TARGET_CUDA`` to a value such as ``ON``, ``TRUE``, ``YES``, ``Y``, or ``1``:

.. code-block:: bash

    python scripts/build_locally.py --verbose --cmake-opts="-DDPCTL_TARGET_CUDA=ON"

- And for AMD devices
+ Note that with this setting kernels are built for the default architecture (``sm_50``), which lets
+ them run on a wider range of devices but prevents the use of more recent CUDA features.
+
+ For reference, compute architecture strings like ``sm_80`` correspond to specific
+ CUDA Compute Capabilities (e.g., Compute Capability 8.0 corresponds to ``sm_80``).
+ A complete mapping between NVIDIA GPU models and their respective
+ Compute Capabilities can be found in the official
+ `CUDA GPU Compute Capability <https://developer.nvidia.com/cuda-gpus>`_ documentation.
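
The naming convention described above can be sketched in a few lines of Python.
This is only an illustration: the helper name ``cuda_arch_string`` is hypothetical and is not
part of ``dpctl`` or its build scripts, and it assumes the plain ``sm_XY`` form, ignoring any
suffixed architecture variants.

.. code-block:: python

    def cuda_arch_string(major: int, minor: int) -> str:
        """Build an ``sm_XY`` string from a Compute Capability ``major.minor``.

        Hypothetical helper for illustration only; not part of dpctl.
        """
        # Compute Capability versions use a non-negative major number
        # and a single-digit minor number.
        if major < 0 or not 0 <= minor <= 9:
            raise ValueError(f"unexpected Compute Capability: {major}.{minor}")
        return f"sm_{major}{minor}"


    # Compute Capability 8.0 gives the string used in the example build command.
    arch = cuda_arch_string(8, 0)
    print(f"-DDPCTL_TARGET_CUDA={arch}")

The resulting string is what would be passed as the value of ``DPCTL_TARGET_CUDA`` in ``--cmake-opts``.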
+
+ AMD build
+ ~~~~~~~~~
+
+ ``dpctl`` can be built for AMD devices using the ``DPCTL_TARGET_HIP`` CMake option,
+ which requires specifying a compute architecture string:

.. code-block:: bash

@@ -174,8 +202,13 @@ And for AMD devices
Note that the `oneAPI for AMD GPUs` plugin requires the architecture be specified and only
one architecture can be specified at a time.

- It is, however, possible to build for Intel devices, CUDA devices, and an AMD device
- architecture all at once:
+ Multi-target build
+ ~~~~~~~~~~~~~~~~~~
+
+ By default, building ``dpctl`` from source enables support for Intel devices only.
+ Adding a custom SYCL target to the build additionally enables support for CUDA or AMD
+ devices in ``dpctl``. Support can also be enabled for both CUDA and AMD
+ devices at the same time:

.. code-block:: bash
