Code autogeneration =================== The xsuite library uses code autogeneration to specialize kernel code for the different contexts. Three contexts are presently available: ``CPU``, ``CUDA``, and ``OpenCL``. The developer writes a single C source code using the portability macros provided by ``xobjects/headers/common.h``. The preferred macro API includes ``GPUFUN`` for functions callable on the GPU device, ``GPUKERN`` for kernels, ``GPUGLMEM`` for pointers to GPU global memory, ``RESTRICT`` for restrict qualifiers, and ``VECTORIZE_OVER`` / ``END_VECTORIZE`` for context-dependent loops. Older sources may still use the comment strings described below; these legacy annotations are kept for compatibility but should not be used in new handwritten C code. With macros, typos are caught by the compiler instead of being silently ignored as unknown comments. ``VECTORIZE_OVER`` block ~~~~~~~~~~~~~~~~~~~~~~~~ The preferred macro syntax is the following: .. code-block:: C VECTORIZE_OVER(myvar, myvarlim); [MY CODE] END_VECTORIZE; This is translated into a for loop in the CPU implementation and into a single-particle block in the parallel implementations (cupy, pyopencl). Older sources may use the legacy ``//vectorize_over`` and ``//end_vectorize`` comments for the same purpose. The corresponding CPU code will be: .. code-block:: C for (int myvar=0; myvar #endif and OpenMP-specific code can be written as: .. code-block:: C #ifdef XO_CONTEXT_CPU_OPENMP #pragma omp parallel for #endif Older sources may still use the legacy ``//only_for_context`` directive. New handwritten C code should use the ``XO_CONTEXT_*`` macros instead so that the code is more readable and so that the typos are caught by the compiler. ``GPUFUN`` directive ~~~~~~~~~~~~~~~~~~~~ ``GPUFUN`` marks a C function that can be called from the kernel code. On CUDA it expands to ``__device__``; on CPU it expands to ``static inline``. Use it for helper functions and element tracking functions that need to work across CPU and GPU contexts. Legacy C sources can use the ``/*gpufun*/`` directive for the same purpose. New code should include ``xobjects/headers/common.h`` and use the ``GPUFUN`` macro instead. ``GPUKERN`` directive ~~~~~~~~~~~~~~~~~~~~~ ``GPUKERN`` marks an entry-point kernel function launched by an Xobjects context. On CUDA it expands to ``__global__``; on OpenCL it expands to ``__kernel``; on CPU it is empty. Legacy C sources can use the ``/*gpukern*/`` directive for the same purpose. New code should include ``xobjects/headers/common.h`` and use the ``GPUKERN`` macro instead. ``GPUGLMEM`` directive ~~~~~~~~~~~~~~~~~~~~~~~ ``GPUGLMEM`` marks a pointer as referring to global memory in GPU contexts. It expands to ``__global`` on OpenCL and is empty on CUDA and CPU. Legacy C sources can use the ``/*gpuglmem*/`` directive for the same purpose. New code should include ``xobjects/headers/common.h`` and use the ``GPUGLMEM`` macro instead.