Contexts

Xsuite supports different platforms, allowing the exploitation of different kinds of hardware (CPUs and GPUs). A context is initialized by instantiating an object from one of the context classes available in Xobjects, which is then passed to the other Xsuite components (see the example in the Getting Started Guide). Contexts are interchangeable, as they expose the same API. Custom kernel functions can be added to the contexts: general source code with annotations is provided to define the kernels and is then automatically specialized for the chosen platform (see dedicated section).

Three contexts are presently available:

The Cupy context, based on cupy-cuda, to run on NVIDIA GPUs

The Pyopencl context, based on PyOpenCL, to run on CPUs or GPUs through the PyOpenCL library

The CPU context, to use conventional CPUs

The corresponding API is described in the following subsections.
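Because all contexts expose the same API, numerical code can be written once against a context object. The sketch below illustrates the pattern using only methods documented in this section (nparray_to_context_array, nparray_from_context_array, synchronize). To stay runnable without Xobjects or a GPU, it uses a hypothetical stand-in class, NumpyStandInContext, which is not part of Xobjects; the assumption is that an actual context such as xobjects.ContextCpu() could be passed in its place.

```python
import numpy as np


class NumpyStandInContext:
    """Hypothetical stand-in (not part of Xobjects) that mimics the
    documented context methods with plain numpy."""

    def nparray_to_context_array(self, arr, copy=False):
        return arr.copy() if copy else arr

    def nparray_from_context_array(self, dev_arr, copy=False):
        return dev_arr.copy() if copy else dev_arr

    def synchronize(self):
        pass  # nothing to wait for on the host


def squared_on_context(context, host_array):
    """Square an array using only the API shared by all contexts."""
    dev = context.nparray_to_context_array(host_array)  # host -> context
    dev_sq = dev * dev  # elementwise arithmetic on the context array
    context.synchronize()  # make sure the computation has completed
    return context.nparray_from_context_array(dev_sq)  # context -> host


context = NumpyStandInContext()  # assumption: a real context can be swapped in
squared = squared_on_context(context, np.array([3.0, 4.0]))
```

The function never refers to a specific platform, so the choice of hardware is confined to the line that creates the context.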

Cupy context

class xobjects.ContextCupy(default_block_size=256, default_shared_mem_size_bytes=0, device=None, backend: Literal[None, 'nvrtc', 'clang'] = None)

Creates a Cupy Context object, which performs the computations on NVIDIA GPUs.

The module-level flag xobjects.context_cupy.no_fast_compile controls whether NVRTC fast compile tuning is disabled. By default it is False, so CUDA kernels built with NVRTC >= 12.9 use --Ofast-compile=min to reduce compilation time and memory usage, at the cost of some runtime performance. Set it to True to disable this option. The environment variable XO_CUDA_NO_FAST_COMPILE also disables it.

Parameters:

default_block_size (int) – CUDA block size (number of threads per block) that is used by default for kernel execution in case a block size is not specified directly in the kernel object. The default value is 256.
device (int) – Identifier of the device to be used by the context.

Returns:

context object.

Return type:

ContextCupy
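The role of default_block_size can be illustrated with the usual CUDA launch arithmetic: the requested number of threads is split into blocks of block_size threads, rounding up. This sketch is an assumption about how a launch grid is typically sized, not a transcription of the Xobjects implementation; the helper name n_blocks_for is hypothetical.

```python
def n_blocks_for(n_threads, block_size=256):
    """Number of blocks needed to cover n_threads (ceiling division)."""
    if n_threads < 0 or block_size <= 0:
        raise ValueError("n_threads must be >= 0 and block_size > 0")
    return (n_threads + block_size - 1) // block_size


# Blocks needed for 1000 threads with the default block size of 256
blocks = n_blocks_for(1000)

# Thread slots in the last block that have no element to process
idle_threads = blocks * 256 - 1000
```

When the number of threads is not a multiple of the block size, the last block is only partially used, which is why GPU kernels normally guard against indices beyond the array length.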

property nplike_array_type

property linked_array_type

build_kernels(sources, kernel_descriptions, specialize=True, apply_to_source=(), save_source_as=None, extra_compile_args=(), extra_cdef=None, extra_classes=(), extra_headers=(), compile=True) → Dict[Tuple[str, tuple], KernelCupy]

nparray_to_context_array(arr, copy=False)

Copies a numpy array to the device memory.

Parameters:

arr (numpy.ndarray) – Array to be transferred
copy (bool) – This parameter is ignored for CUDA, as the data lives on a different device.

Returns:

The same array copied to the device.

Return type:

cupy.ndarray

nparray_from_context_array(dev_arr, copy=False)

Copies an array from the device to a numpy array.

Parameters:

dev_arr (cupy.ndarray) – Array to be transferred.
copy (bool) – This parameter is ignored for CUDA, as the data lives on a different device.

Returns:

The same data copied to a numpy array.

Return type:

numpy.ndarray

property nplike_lib: Module containing all the numpy features supported by cupy.

property splike_lib: Module containing all the scipy features supported by cupy.

synchronize(): Ensures that all computations submitted to the context are completed. Equivalent to cupy.cuda.stream.get_current_stream().synchronize()

zeros(*args, **kwargs): Allocates an array of zeros on the device. The function has the same interface as numpy.zeros

plan_FFT(data, axes)

Generates an FFT plan object to be executed on the context.

Parameters:

data (cupy.ndarray) – Array having type and shape for which the FFT needs to be planned.
axes (sequence of ints) – Axes along which the FFT needs to be performed.

Returns:

FFT plan for the required array shape, type and axes.

Return type:

FFTCupy

Example:

plan = context.plan_FFT(data, axes=(0,1))

data2 = 2*data

# Forward transform (in place)
plan.transform(data2)

# Inverse transform (in place)
plan.itransform(data2)

property kernels: Dictionary containing all the kernels that have been imported to the context. The syntax context.kernels.mykernel can also be used.

add_kernels(kernels: dict, sources: list = None, specialize: bool = True, apply_to_source: Sequence[callable] = (), save_source_as: str = None, extra_cdef: Optional[str] = '', extra_classes: Sequence[Type] = (), extra_headers: Sequence[Union[str, Path, io.TextIOBase, Source]] = (), compile: bool = True, extra_compile_args: Sequence[str] = ())

Adds user-defined kernels to the context. The kernel source code is provided as a string and/or in source files and must contain the kernel names defined in the kernel descriptions.

Parameters:

sources (list) – List of source codes that are concatenated before compilation. The list can contain strings (raw source code), File objects and Path objects.
kernels (dict) – Dictionary with the kernel descriptions in the form given by the following example. The descriptions define the kernel names, the type and name of the arguments and identify one input argument that defines the number of threads to be launched (only on CUDA/OpenCL).
specialize (bool) – If True, the code is specialized using annotations in the source code. Default is True
apply_to_source (List[Callable]) – functions to be applied to source
save_source_as (str) – Filename for saving the specialized source code. Default is `None`.
extra_cdef – Extra C definitions to be passed to cffi.
extra_classes – Extra xobjects classes whose API is needed.
extra_headers – Extra headers to be added to the source code.
compile – If True, the source code is compiled. Default is True. Otherwise, a dummy kernel is returned, with the source code attached.

Example:

import xobjects as xo

ctx = xo.ContextCupy()  # context on which the kernel is built

# A simple kernel
src_code = '''
/*gpukern*/
void my_mul(const int n,
    /*gpuglmem*/ const double* x1,
    /*gpuglmem*/ const double* x2,
    /*gpuglmem*/       double* y) {
    for (int tid=0; tid<n; tid++){ //vectorize_over tid n
        y[tid] = x1[tid] * x2[tid];
    }//end_vectorize
}
'''

# Prepare description
kernel_descriptions = {
    "my_mul": xo.Kernel(
        args=[
            xo.Arg(xo.Int32, name="n"),
            xo.Arg(xo.Float64, pointer=True, const=True, name="x1"),
            xo.Arg(xo.Float64, pointer=True, const=True, name="x2"),
            xo.Arg(xo.Float64, pointer=True, const=False, name="y"),
        ],
        n_threads="n",
        ),
}

# Import kernel in context
ctx.add_kernels(
    sources=[src_code],
    kernels=kernel_descriptions,
    save_source_as=None,
)

# With a1, a2, b being arrays on the context, the kernel
# can be called as follows:
ctx.kernels.my_mul(n=len(a1), x1=a1, x2=a2, y=b)
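The apply_to_source parameter takes functions that are applied to the source. A minimal sketch of such a function is given below, assuming that each callable receives the source code as a string and returns the modified string (this calling convention is an assumption, it is not stated in this section). The names add_scale_define, apply_all and the macro MY_SCALE are hypothetical.

```python
def add_scale_define(source: str) -> str:
    """Prepend a preprocessor definition to the kernel source."""
    return "#define MY_SCALE 2.0\n" + source


def apply_all(source, functions):
    """Apply the functions one after the other to the source string."""
    for func in functions:
        source = func(source)
    return source


src_code = "/*gpukern*/ void my_kernel(const int n) {}\n"
modified = apply_all(src_code, [add_scale_define])
```

Under this convention the function would be passed as apply_to_source=[add_scale_define] in the call to add_kernels.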

allow_prebuilt_kernels = False

property buffers

get_installed_c_source_and_library_paths() → tuple[set[pathlib.Path], set[str], set[pathlib.Path]]

Returns a list of C paths registered in dependent packages.

In a package that depends on xobjects, you can register C source and library paths using the entry point xobjects.build_info. These paths will be added to the C include path and the library path when building kernels. For example, the following allows writing #include <xcoll/path/to/some/header.h> in kernel sources and using functions from the library xcoll/lib/libFlukaIO.a:

and in the file xcoll/_xobjects.py:

minimum_alignment = 1

new_buffer(capacity=1048576)

PyOpenCL context

class xobjects.ContextPyopencl(device=None, patch_pyopencl_array=True, minimum_alignment=None)

Creates a Pyopencl Context object, that allows performing the computations on GPUs and CPUs through PyOpenCL.

Parameters:

device (str or Device) – The device (CPU or GPU) for the simulation.
default_kernels (bool) – If True, the Xfields default kernels are automatically imported.
patch_pyopencl_array (bool) – If True, the PyOpenCL array class is patched to allow some operations with non-contiguous arrays.
specialize_code (bool) – If True, the code is specialized using annotations in the source code. Default is True

Returns:

context object.

Return type:

ContextPyopencl
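When the device is given as a string, one plausible convention is "<platform index>.<device index>", for instance "0.0" for the first device of the first OpenCL platform. This format is an assumption (it is not stated in this section), and the helper parse_device_string below is hypothetical; it only shows how such a string splits into two indices.

```python
def parse_device_string(device):
    """Split a 'platform.device' string into two integer indices."""
    platform_str, device_str = device.split(".")
    return int(platform_str), int(device_str)


# Second device (index 1) of the first platform (index 0)
platform_index, device_index = parse_device_string("0.1")
```

The devices actually available on a machine can be listed with the print_devices class method documented below.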

context_cache = {}

property nplike_array_type

property linked_array_type

classmethod get_devices()

classmethod print_devices()

minimum_alignment = 1

find_minimum_alignment()

build_kernels(sources, kernel_descriptions, specialize=True, apply_to_source=(), save_source_as=None, extra_compile_args=(), extra_cdef=None, extra_classes=(), extra_headers=(), compile=True) → Dict[Tuple[str, tuple], KernelPyopencl]

nparray_to_context_array(arr, copy=False)

Copies a numpy array to the device memory.

Parameters:

arr (numpy.ndarray) – Array to be transferred
copy (bool) – This parameter is ignored for OpenCL, as the data lives on a different device.

Returns:

The same array copied to the device.

Return type:

pyopencl.array.Array

nparray_from_context_array(dev_arr, copy=False)

Copies an array from the device to a numpy array.

Parameters:

dev_arr (pyopencl.array.Array) – Array to be transferred.
copy (bool) – This parameter is ignored for OpenCL, as the data lives on a different device.

Returns:

The same data copied to a numpy array.

Return type:

numpy.ndarray

property nplike_lib: Module containing all the numpy features supported by PyOpenCL (optionally with patches to operate with non-contiguous arrays).

property splike_lib: Scipy features are not available through openCL

synchronize(): Ensures that all computations submitted to the context are completed. No action is performed by this function in the Pyopencl context. The method is provided so that the Pyopencl context has an identical API to the Cupy one.

zeros(*args, **kwargs): Allocates an array of zeros on the device. The function has the same interface as numpy.zeros

plan_FFT(data, axes, wait_on_call=True)

Generates an FFT plan object to be executed on the context.

Parameters:

data (pyopencl.array.Array) – Array having type and shape for which the FFT needs to be planned.
axes (sequence of ints) – Axes along which the FFT needs to be performed.

Returns:

FFT plan for the required array shape, type and axes.

Return type:

FFTPyopencl

Example:

plan = context.plan_FFT(data, axes=(0,1))

data2 = 2*data

# Forward transform (in place)
plan.transform(data2)

# Inverse transform (in place)
plan.itransform(data2)

property kernels: Dictionary containing all the kernels that have been imported to the context. The syntax context.kernels.mykernel can also be used.

add_kernels(kernels: dict, sources: list = None, specialize: bool = True, apply_to_source: Sequence[callable] = (), save_source_as: str = None, extra_cdef: Optional[str] = '', extra_classes: Sequence[Type] = (), extra_headers: Sequence[Union[str, Path, io.TextIOBase, Source]] = (), compile: bool = True, extra_compile_args: Sequence[str] = ())

Adds user-defined kernels to the context. The kernel source code is provided as a string and/or in source files and must contain the kernel names defined in the kernel descriptions.

Parameters:

sources (list) – List of source codes that are concatenated before compilation. The list can contain strings (raw source code), File objects and Path objects.
kernels (dict) – Dictionary with the kernel descriptions in the form given by the following example. The descriptions define the kernel names, the type and name of the arguments and identify one input argument that defines the number of threads to be launched (only on CUDA/OpenCL).
specialize (bool) – If True, the code is specialized using annotations in the source code. Default is True
apply_to_source (List[Callable]) – functions to be applied to source
save_source_as (str) – Filename for saving the specialized source code. Default is `None`.
extra_cdef – Extra C definitions to be passed to cffi.
extra_classes – Extra xobjects classes whose API is needed.
extra_headers – Extra headers to be added to the source code.
compile – If True, the source code is compiled. Default is True. Otherwise, a dummy kernel is returned, with the source code attached.

Example:

import xobjects as xo

ctx = xo.ContextPyopencl()  # context on which the kernel is built

# A simple kernel
src_code = '''
/*gpukern*/
void my_mul(const int n,
    /*gpuglmem*/ const double* x1,
    /*gpuglmem*/ const double* x2,
    /*gpuglmem*/       double* y) {
    for (int tid=0; tid<n; tid++){ //vectorize_over tid n
        y[tid] = x1[tid] * x2[tid];
    }//end_vectorize
}
'''

# Prepare description
kernel_descriptions = {
    "my_mul": xo.Kernel(
        args=[
            xo.Arg(xo.Int32, name="n"),
            xo.Arg(xo.Float64, pointer=True, const=True, name="x1"),
            xo.Arg(xo.Float64, pointer=True, const=True, name="x2"),
            xo.Arg(xo.Float64, pointer=True, const=False, name="y"),
        ],
        n_threads="n",
        ),
}

# Import kernel in context
ctx.add_kernels(
    sources=[src_code],
    kernels=kernel_descriptions,
    save_source_as=None,
)

# With a1, a2, b being arrays on the context, the kernel
# can be called as follows:
ctx.kernels.my_mul(n=len(a1), x1=a1, x2=a2, y=b)

allow_prebuilt_kernels = False

property buffers

get_installed_c_source_and_library_paths() → tuple[set[pathlib.Path], set[str], set[pathlib.Path]]

Returns a list of C paths registered in dependent packages.

In a package that depends on xobjects, you can register C source and library paths using the entry point xobjects.build_info. These paths will be added to the C include path and the library path when building kernels. For example, the following allows writing #include <xcoll/path/to/some/header.h> in kernel sources and using functions from the library xcoll/lib/libFlukaIO.a:

and in the file xcoll/_xobjects.py:

new_buffer(capacity=1048576)

CPU context

class xobjects.ContextCpu(omp_num_threads=0)

Creates a CPU context object, which performs the computations on conventional CPUs, either serially or in parallel using OpenMP.

Parameters:

omp_num_threads (int | Literal['auto']) – Number of threads to be used by OpenMP. If 0, no parallelization is used. If 'auto', the number of threads is selected automatically by OpenMP.

Returns:

context object.

Return type:

ContextCpu

property nplike_array_type

property linked_array_type

allow_prebuilt_kernels = False

add_kernels(sources=None, kernels=None, specialize=True, apply_to_source=(), save_source_as=None, extra_compile_args: Sequence[str] = (), extra_link_args: Sequence[str] = (), extra_cdef='', extra_classes=(), extra_headers=(), compile=True)

Adds user-defined kernels to the context. The kernel source code is provided as a string and/or in source files and must contain the kernel names defined in the kernel descriptions.

Parameters:

sources (list) – List of source codes that are concatenated before compilation. The list can contain strings (raw source code), File objects and Path objects.
kernels (dict) – Dictionary with the kernel descriptions in the form given by the following example. The descriptions define the kernel names, the type and name of the arguments and identify one input argument that defines the number of threads to be launched (only on CUDA/OpenCL).
specialize (bool) – If True, the code is specialized using annotations in the source code. Default is True
apply_to_source (List[Callable]) – functions to be applied to source
save_source_as (str) – Filename for saving the specialized source code. Default is `None`.
extra_compile_args – Extra arguments to be passed to the compiler.
extra_link_args – Extra arguments to be passed to the linker.
extra_cdef – Extra C definitions to be passed to cffi.
extra_classes – Extra xobjects classes whose API is needed.
extra_headers – Extra headers to be added to the source code.
compile – If True, the source code is compiled. Default is True. Otherwise, a dummy kernel is returned, with the source code attached.

Example:

import xobjects as xo

ctx = xo.ContextCpu()  # context on which the kernel is built

# A simple kernel
src_code = '''
/*gpukern*/
void my_mul(const int n,
    /*gpuglmem*/ const double* x1,
    /*gpuglmem*/ const double* x2,
    /*gpuglmem*/       double* y) {
    for (int tid=0; tid<n; tid++){ //vectorize_over tid n
        y[tid] = x1[tid] * x2[tid];
    }//end_vectorize
}
'''

# Prepare description
kernel_descriptions = {
    "my_mul": xo.Kernel(
        args=[
            xo.Arg(xo.Int32, name="n"),
            xo.Arg(xo.Float64, pointer=True, const=True, name="x1"),
            xo.Arg(xo.Float64, pointer=True, const=True, name="x2"),
            xo.Arg(xo.Float64, pointer=True, const=False, name="y"),
        ],
        n_threads="n",
        ),
}

# Import kernel in context
ctx.add_kernels(
    sources=[src_code],
    kernels=kernel_descriptions,
    save_source_as=None,
)

# With a1, a2, b being arrays on the context, the kernel
# can be called as follows:
ctx.kernels.my_mul(n=len(a1), x1=a1, x2=a2, y=b)
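To check the result of a kernel such as my_mul, it can help to have a host-side reference. The sketch below is a plain numpy equivalent of what the kernel computes: one multiplication per index tid from 0 to n - 1 (on a GPU each index would be handled by a separate thread; the serial loop here is only an illustration). The function name my_mul_reference is hypothetical and not part of Xobjects.

```python
import numpy as np


def my_mul_reference(n, x1, x2, y):
    """Host-side equivalent of the my_mul kernel: y[tid] = x1[tid] * x2[tid]."""
    for tid in range(n):  # plays the role of the vectorized index
        y[tid] = x1[tid] * x2[tid]


a1 = np.array([1.0, 2.0, 3.0])
a2 = np.array([4.0, 5.0, 6.0])
b = np.zeros_like(a1)

# Same keyword arguments as in the kernel call
my_mul_reference(n=len(a1), x1=a1, x2=a2, y=b)
# b now holds the elementwise product of a1 and a2
```

Comparing the output of the compiled kernel against such a reference (for example with numpy.allclose) is a simple way to validate a new kernel on each context.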

build_kernels(kernel_descriptions: Dict[str, Kernel], module_name: str = None, containing_dir='.', sources=None, specialize=True, apply_to_source=(), save_source_as=None, extra_compile_args=(), extra_link_args=(), extra_cdef='', extra_classes=(), extra_headers=(), compile=True) → Dict[Tuple[str, tuple], KernelCpu]

kernels_from_file(module_name: str, kernel_descriptions: Dict[str, Kernel], containing_dir='.') → Dict[Tuple[str, tuple], KernelCpu]: Import a compiled module module_name located in containing_dir (by default it is the current working directory), and add the kernels from the module, as defined in kernel_descriptions, to the context. Returns the path to the loaded so file.

compile_kernel(module_name, kernel_descriptions, cdefs, specialized_source, extra_compile_args, extra_link_args, containing_dir='.') → Path

static cffi_module_for_c_types(c_types, containing_dir='.')

nparray_to_context_array(arr, copy=False)

Moves a numpy array to the device memory. No action is performed by this function in the CPU context. The method is provided so that the CPU context has an identical API to the GPU ones.

Parameters:

arr (numpy.ndarray) – Array to be transferred
copy (bool) – If True, a copy of the array is made.

Returns:

Numpy array with the same data, original or a copy.

Return type:

numpy.ndarray
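The effect of the copy flag on the CPU context can be illustrated with plain numpy. The function below is a hypothetical stand-in that follows the behaviour described above (return the original array, or a copy when copy=True); it is not the Xobjects implementation.

```python
import numpy as np


def to_context_array_cpu_like(arr, copy=False):
    """Stand-in for the CPU behaviour: original array, or a copy on request."""
    return arr.copy() if copy else arr


host = np.zeros(3)
shared = to_context_array_cpu_like(host)  # same object as host
independent = to_context_array_cpu_like(host, copy=True)  # separate memory

shared[0] = 1.0  # also visible through host
independent[1] = 2.0  # host is unaffected
```

On the GPU contexts the flag is ignored and the data is always copied, so code that must behave identically on all contexts should not rely on this aliasing.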

nparray_from_context_array(dev_arr, copy=False)

Moves an array from the device to a numpy array. No action is performed by this function in the CPU context. The method is provided so that the CPU context has an identical API to the GPU ones.

Parameters:

dev_arr (numpy.ndarray) – Array to be transferred
copy (bool) – If True, a copy of the array is made.

Returns:

Numpy array with the same data, original or a copy.

Return type:

numpy.ndarray

property nplike_lib: Module containing all the numpy features. Numpy members should be accessed through nplike_lib to keep compatibility with the other contexts.

property splike_lib: Module containing all the scipy features. Numpy members should be accessed through splike_lib to keep compatibility with the other contexts.

synchronize(): Ensures that all computations submitted to the context are completed. No action is performed by this function in the CPU context. The method is provided so that the CPU context has an identical API to the GPU ones.

zeros(*args, **kwargs): Allocates an array of zeros on the device. The function has the same interface as numpy.zeros

plan_FFT(data, axes)

Generate an FFT plan object to be executed on the context.

Parameters:

data (numpy.ndarray) – Array having type and shape for which the FFT needs to be planned.
axes (sequence of ints) – Axes along which the FFT needs to be performed.

Returns:

FFT plan for the required array shape, type and axes.

Return type:

FFTCpu

Example:

plan = context.plan_FFT(data, axes=(0,1))

data2 = 2*data

# Forward transform (in place)
plan.transform(data2)

# Inverse transform (in place)
plan.itransform(data2)
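The plan interface used above (transform and itransform, both in place) can be mimicked with numpy to show the expected round-trip behaviour. NumpyFFTPlanSketch is a hypothetical stand-in written for this illustration, not the FFTCpu class; it assumes a complex-valued array so that the result can be stored in place.

```python
import numpy as np


class NumpyFFTPlanSketch:
    """Hypothetical stand-in exposing transform/itransform like an FFT plan."""

    def __init__(self, data, axes):
        self.shape = data.shape  # shape the plan was made for
        self.dtype = data.dtype  # type the plan was made for
        self.axes = tuple(axes)

    def transform(self, data):
        data[:] = np.fft.fftn(data, axes=self.axes)  # forward, in place

    def itransform(self, data):
        data[:] = np.fft.ifftn(data, axes=self.axes)  # inverse, in place


data = np.arange(12, dtype=np.complex128).reshape(3, 4)
plan = NumpyFFTPlanSketch(data, axes=(0, 1))

data2 = 2 * data
plan.transform(data2)
dc_term = data2[0, 0]  # zero-frequency term: sum of all elements of 2*data
plan.itransform(data2)  # data2 is back to 2*data, up to rounding
```

Because both calls overwrite their argument, the original array should be copied first if it is still needed after the transform.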

property kernels: Dictionary containing all the kernels that have been imported to the context. The syntax context.kernels.mykernel can also be used.

property openmp_enabled

property buffers

get_installed_c_source_and_library_paths() → tuple[set[pathlib.Path], set[str], set[pathlib.Path]]

Returns a list of C paths registered in dependent packages.

In a package that depends on xobjects, you can register C source and library paths using the entry point xobjects.build_info. These paths will be added to the C include path and the library path when building kernels. For example, the following allows writing #include <xcoll/path/to/some/header.h> in kernel sources and using functions from the library xcoll/lib/libFlukaIO.a:

and in the file xcoll/_xobjects.py:

minimum_alignment = 1

new_buffer(capacity=1048576)