Support for multiple devices was officially added in OpenCL 1.1. Among other things, this makes it possible to use all CPUs on a multi-socket mainboard as a single OpenCL compute device. Nevertheless, the efficient use of multiple OpenCL devices is far from trivial, because algorithms have to be designed to take distributed memory and synchronization issues into account.
Unless specified otherwise (see User-Provided OpenCL Contexts), ViennaCL silently creates its own context and adds all available default devices to it, with a single queue per device. All operations are then carried out on this (or the user-provided) context, which can be obtained with the call
    viennacl::ocl::current_context();
This default context is identified by the ID 0 (of type long). ViennaCL uses the first platform returned by the OpenCL backend for the context. If a different platform should be used on a machine with multiple platforms available, this can be achieved with
    viennacl::ocl::set_context_platform_index(id, platform_index);
where the context ID is id and platform_index refers to the array index of the platform as returned by clGetPlatformIDs().
By default, only the first device in the context is used for all operations. This device can be obtained via
    viennacl::ocl::current_device();
A user may wish to use multiple OpenCL contexts, where each context consists of a subset of the available devices. To set up a context with ID id with a particular device type only, the user has to specify this prior to any other ViennaCL-related statements:
    viennacl::ocl::set_context_device_type(id, viennacl::ocl::gpu_tag());
Instead of using the tag classes, the respective OpenCL constants CL_DEVICE_TYPE_GPU etc. can be supplied as second argument.
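Another possibility is to query all devices from the current platform:
    std::vector<viennacl::ocl::device> devices = viennacl::ocl::platform().devices();
and create a custom subset of devices, which is then passed to the context setup routine:
    // take the first and the third available device from 'devices'
    std::vector<viennacl::ocl::device> my_devices;
    my_devices.push_back(devices[0]);
    my_devices.push_back(devices[2]);
    viennacl::ocl::setup_context(id, my_devices);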
Similarly, contexts with other IDs can be set up.
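The selection of a device subset can be illustrated without an OpenCL runtime. The following is a minimal sketch, not ViennaCL code: ToyDevice and select_devices are hypothetical names introduced for illustration, and the type bits are copied from the cl_device_type values in the OpenCL headers. It filters a device list by a type mask, which is roughly what one would do before handing the subset to a context setup routine.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Bit values as defined for cl_device_type in the OpenCL headers
// (CL_DEVICE_TYPE_CPU, CL_DEVICE_TYPE_GPU, CL_DEVICE_TYPE_ACCELERATOR).
constexpr std::uint64_t kTypeCpu         = 1u << 1;
constexpr std::uint64_t kTypeGpu         = 1u << 2;
constexpr std::uint64_t kTypeAccelerator = 1u << 3;

// Hypothetical stand-in for a device handle; not a ViennaCL or OpenCL type.
struct ToyDevice {
    std::string   name;
    std::uint64_t type;
};

// Return the devices whose type matches any bit set in type_mask,
// preserving the order in which the platform reported them.
std::vector<ToyDevice> select_devices(const std::vector<ToyDevice>& all,
                                      std::uint64_t type_mask) {
    std::vector<ToyDevice> subset;
    for (const ToyDevice& d : all)
        if (d.type & type_mask)
            subset.push_back(d);
    return subset;
}
```

Calling select_devices(all, kTypeGpu) on a list holding one CPU and two GPUs yields the two GPUs in their original order. Masks can be combined with bitwise OR, as with the OpenCL constants.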
The user is reminded that memory objects within an OpenCL context are allocated for all devices within that context. Thus, setting up contexts with one device each is optimal in terms of memory usage, because each memory object is then bound to a single device only. However, memory transfer between contexts (and thus devices) then has to be done manually by the library user. Moreover, the user has to keep track of the context in which each ViennaCL object has been created, because all operands are assumed to be in the currently active context.
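One way to do this bookkeeping is a small ledger on the application side. The sketch below is an assumption about how a user might organize it, not something ViennaCL provides: ContextLedger and its member names are hypothetical. It records the context ID under which each object was created and checks that all operands of an operation belong to the active context.

```cpp
#include <cassert>
#include <initializer_list>
#include <string>
#include <unordered_map>

// Hypothetical application-side helper (not part of ViennaCL): remembers
// the ID of the context in which each named object was created.
class ContextLedger {
public:
    // Register an object under the context that was active at creation time.
    void record(const std::string& object_name, long context_id) {
        owner_[object_name] = context_id;
    }

    // True only if every operand is known and was created in active_id.
    bool all_in_context(std::initializer_list<std::string> operands,
                        long active_id) const {
        for (const std::string& name : operands) {
            auto it = owner_.find(name);
            if (it == owner_.end() || it->second != active_id)
                return false;
        }
        return true;
    }

private:
    std::unordered_map<std::string, long> owner_;
};
```

A check such as ledger.all_in_context({"x", "y"}, 0) before an operation on x and y catches operands that live in a different context and therefore need a manual transfer first.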
ViennaCL always uses the currently active OpenCL context with the currently active device to enqueue compute kernels. The default context is identified by ID 0. The context with ID id can be set as active context with the line
    viennacl::ocl::switch_context(id);
Subsequent kernels are then enqueued on the active device for that particular context.
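Similar to setting contexts active, the active device can be set for each context. For example, to make the second device in the context the active device, the line
    viennacl::ocl::current_context().switch_device(1);
is required. In some circumstances one may want to pass the device object directly, e.g. to set the second device of the platform active:
    std::vector<viennacl::ocl::device> devices = viennacl::ocl::platform().devices();
    viennacl::ocl::current_context().switch_device(devices[1]);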
If the supplied device is not part of the context, an error message is printed and the active device remains unchanged.
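The behavior described above (select by index or by device object, and keep the active device when the request cannot be satisfied) can be modeled in a few lines. This is a toy model under stated assumptions, not the library's implementation: ToyDeviceContext and its members are hypothetical, device names stand in for device handles, and the bounds check on the index is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical model of a context holding several devices.
struct ToyDeviceContext {
    std::vector<std::string> devices;  // device names stand in for device handles
    std::size_t active = 0;            // index of the currently active device

    // Select the active device by its index within the context.
    bool switch_to_index(std::size_t index) {
        if (index >= devices.size()) {
            std::cerr << "device index " << index << " is out of range\n";
            return false;              // active device remains unchanged
        }
        active = index;
        return true;
    }

    // Select the active device by object; devices that are not part of
    // the context are rejected with a message.
    bool switch_to_device(const std::string& device) {
        auto it = std::find(devices.begin(), devices.end(), device);
        if (it == devices.end()) {
            std::cerr << "device '" << device << "' is not part of this context\n";
            return false;              // active device remains unchanged
        }
        active = static_cast<std::size_t>(it - devices.begin());
        return true;
    }
};
```

In this model, requesting a device that the context does not contain prints a message and leaves the active index untouched, so kernels enqueued afterwards still go to the previously active device.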
Each OpenCL context provides a member function .build_options(), which can be used to pass OpenCL compiler flags prior to compilation. Flags need to be passed to the context prior to the compilation of the respective kernels, i.e. prior to the first instantiation of the respective matrix or vector types.
To pass the -cl-mad-enable flag to the current context, the line
    viennacl::ocl::current_context().build_options("-cl-mad-enable");
is sufficient. Refer to the OpenCL standard for a full list of flags.
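Why the ordering matters can be seen in a toy model. The sketch below assumes that programs are compiled once on first use and then cached, which is an assumption made for illustration and not a statement about ViennaCL internals. ToyBuildContext and its members are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model of a context that compiles programs lazily.
struct ToyBuildContext {
    std::string options;                          // flags applied to future compilations
    std::map<std::string, std::string> programs;  // program name -> flags it was built with

    // Set the compiler flags used by all subsequent compilations.
    void build_options(const std::string& flags) { options = flags; }

    // "Compile" on first use only; later calls return the cached build,
    // regardless of any flags set in the meantime.
    const std::string& program(const std::string& name) {
        auto it = programs.find(name);
        if (it == programs.end())
            it = programs.emplace(name, options).first;
        return it->second;
    }
};
```

A program requested before build_options() is called keeps the flags it was built with, while programs requested afterwards pick up the new flags. Under this assumption, flags set too late silently miss the kernels that were already compiled.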