Reference
- class xobjects.ContextCpu(omp_num_threads=0)
Creates a CPU platform object that allows performing computations on conventional CPUs. The context can be serial or parallelized using OpenMP.
- Parameters:
omp_num_threads (int | Literal['auto']) – Number of threads to be used by OpenMP. If 0, no parallelization is used. If 'auto', the number of threads is selected automatically by OpenMP.
- Returns:
platform object.
- Return type:
ContextCpu
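A minimal usage sketch of the constructor (the alias xo for xobjects follows the convention of the add_kernels example further below):
import xobjects as xo

# Serial CPU context (no OpenMP parallelization)
ctx = xo.ContextCpu()

# CPU context parallelized over 4 OpenMP threads
ctx_omp = xo.ContextCpu(omp_num_threads=4)

# Let OpenMP select the number of threads automatically
ctx_auto = xo.ContextCpu(omp_num_threads='auto')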
- property nplike_array_type
- property linked_array_type
- add_kernels(sources=None, kernels=None, specialize=True, apply_to_source=(), save_source_as=None, extra_compile_args: Sequence[str] = ('-O3', '-Wno-unused-function'), extra_link_args: Sequence[str] = ('-O3',), extra_cdef='', extra_classes=(), extra_headers=(), compile=True)
Adds user-defined kernels to the context. The kernel source code is provided as a string and/or in source files and must contain the kernel names defined in the kernel descriptions.
- Parameters:
sources – List of source codes that are concatenated before compilation. The list can contain strings (raw source code), File objects and Path objects.
kernels (dict) – Dictionary with the kernel descriptions in the form given by the following examples. The descriptions define the kernel names, the type and name of the arguments, and identify one input argument that defines the number of threads to be launched (only on cuda/opencl).
specialize (bool) – If True, the code is specialized using annotations in the source code. Default is True.
apply_to_source (List[Callable]) – Functions to be applied to the source.
save_source_as (str) – Filename for saving the specialized source code. Default is None.
extra_compile_args – Extra arguments to be passed to the compiler.
extra_link_args – Extra arguments to be passed to the linker.
extra_cdef – Extra C definitions to be passed to cffi.
extra_classes – Extra xobjects classes whose API is needed.
extra_headers – Extra headers to be added to the source code.
compile – If True, the source code is compiled. Default is True. Otherwise, a dummy kernel is returned, with the source code attached.
Example:
# A simple kernel
src_code = '''
/*gpukern*/ void my_mul(const int n,
    /*gpuglmem*/ const double* x1,
    /*gpuglmem*/ const double* x2,
    /*gpuglmem*/ double* y) {
    int tid = 0 //vectorize_over tid
    y[tid] = x1[tid] * x2[tid];
    //end_vectorize
}
'''

# Prepare description
kernel_descriptions = {
    "my_mul": xo.Kernel(
        args=[
            xo.Arg(xo.Int32, name="n"),
            xo.Arg(xo.Float64, pointer=True, const=True, name="x1"),
            xo.Arg(xo.Float64, pointer=True, const=True, name="x2"),
            xo.Arg(xo.Float64, pointer=True, const=False, name="y"),
        ],
        n_threads="n",
    ),
}

# Import kernel in context
ctx.add_kernels(
    sources=[src_code],
    kernels=kernel_descriptions,
    save_source_as=None,
)

# With a1, a2, b being arrays on the context, the kernel
# can be called as follows:
ctx.kernels.my_mul(n=len(a1), x1=a1, x2=a2, y=b)
- build_kernels(kernel_descriptions: Dict[str, Kernel], module_name: str = None, containing_dir='.', sources=None, specialize=True, apply_to_source=(), save_source_as=None, extra_compile_args=('-O3', '-Wno-unused-function'), extra_link_args=('-O3',), extra_cdef='', extra_classes=(), extra_headers=(), compile=True) → Dict[Tuple[str, tuple], KernelCpu]
- kernels_from_file(module_name: str, kernel_descriptions: Dict[str, Kernel], containing_dir='.') → Dict[Tuple[str, tuple], KernelCpu]
Imports a compiled module module_name located in containing_dir (by default the current working directory) and adds the kernels from the module, as defined in kernel_descriptions, to the context. Returns the path to the loaded .so file.
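A hedged sketch of a possible build-then-reload workflow, assuming src_code and kernel_descriptions are defined as in the add_kernels example above; the module name 'my_kernels' is illustrative:
# Build the kernels once into a named module (assumed to be placed in containing_dir)
ctx = xo.ContextCpu()
ctx.build_kernels(
    kernel_descriptions=kernel_descriptions,
    module_name='my_kernels',
    containing_dir='.',
    sources=[src_code],
)

# In a later session, load the precompiled module instead of recompiling
ctx2 = xo.ContextCpu()
ctx2.kernels_from_file(
    module_name='my_kernels',
    kernel_descriptions=kernel_descriptions,
    containing_dir='.',
)
ctx2.kernels.my_mul(n=len(a1), x1=a1, x2=a2, y=b)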
- compile_kernel(module_name, kernel_descriptions, cdefs, specialized_source, extra_compile_args, extra_link_args, containing_dir='.') → Path
- static cffi_module_for_c_types(c_types, containing_dir='.')
- nparray_to_context_array(arr)
Moves a numpy array to the device memory. No action is performed by this function in the CPU context. The method is provided so that the CPU context has an identical API to the GPU ones.
- Parameters:
arr (numpy.ndarray) – Array to be transferred
- Returns:
The same array (no copy!).
- Return type:
numpy.ndarray
- nparray_from_context_array(dev_arr)
Moves an array from the device to a numpy array. No action is performed by this function in the CPU context. The method is provided so that the CPU context has an identical API to the GPU ones.
- Parameters:
dev_arr (numpy.ndarray) – Array to be transferred
- Returns:
The same array (no copy!)
- Return type:
numpy.ndarray
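Illustrative round trip with both transfer methods (in the CPU context they are no-ops and hand back the very same numpy array, which keeps the code portable to GPU contexts):
import numpy as np

a = np.linspace(0.0, 1.0, 5)
a_dev = ctx.nparray_to_context_array(a)         # no copy in the CPU context
a_back = ctx.nparray_from_context_array(a_dev)  # no copy in the CPU context
assert a_back is a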
- property nplike_lib
Module containing all the numpy features. Numpy members should be accessed through nplike_lib to keep compatibility with the other contexts.
- property splike_lib
Module containing all the scipy features. Scipy members should be accessed through splike_lib to keep compatibility with the other contexts.
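A short portability sketch (the specific numpy calls are illustrative):
nplike = ctx.nplike_lib            # numpy in the CPU context
x = nplike.linspace(0.0, 1.0, 10)
y = nplike.sqrt(x)
# scipy features are reached the same way through ctx.splike_lib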
- synchronize()
Ensures that all computations submitted to the context are completed. No action is performed by this function in the CPU context. The method is provided so that the CPU context has an identical API to the GPU ones.
- zeros(*args, **kwargs)
Allocates an array of zeros on the device. The function has the same interface as numpy.zeros.
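For example (the shapes and dtype are illustrative; the arguments mirror numpy.zeros):
import numpy as np

z1 = ctx.zeros(1000)                      # 1D array of zeros
z2 = ctx.zeros((3, 4), dtype=np.float64)  # shaped array with explicit dtype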
- plan_FFT(data, axes)
Generate an FFT plan object to be executed on the context.
- Parameters:
data (numpy.ndarray) – Array having type and shape for which the FFT needs to be planned.
axes (sequence of ints) – Axes along which the FFT needs to be performed.
- Returns:
FFT plan for the required array shape, type and axes.
- Return type:
FFTCpu
Example:
plan = context.plan_FFT(data, axes=(0, 1))

data2 = 2 * data

# Forward transform (in place)
plan.transform(data2)

# Inverse transform (in place)
plan.itransform(data2)
- property kernels
Dictionary containing all the kernels that have been imported to the context. The syntax context.kernels.mykernel can also be used.
- property openmp_enabled