This module provides C-style memory management functions. Its purpose is not to become a fully featured container library. It is to provide portable malloc, memcpy and free functions with a few helpers to copy data to and from the devices.
Note that the functions below simply wrap the corresponding C functions when targeting a CPU.
template <typename T> T *device_malloc(size_t sz)
Allocates sz * sizeof(T) bytes of memory on the device. On error, NULL is returned.
template <typename T> T *device_calloc(size_t sz)
Allocates sz * sizeof(T) bytes of memory on the device and sets the allocated memory to zero. On error, NULL is returned.
template <typename T> void device_free(T *ptr)
Frees the memory pointed to by the given pointer.
template <typename T> void copy_to_device(T *device_ptr, T *host_ptr,
size_t sz)
Copies data from host to device.
template <typename T> void copy_to_host(T *host_ptr, T *device_ptr,
size_t sz)
Copies data from device to host.
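To make the CPU case concrete, here is a minimal sketch of how such wrappers could look when no accelerator is involved. It is not the module's actual source: the sketch namespace and the roundtrip_demo function are hypothetical names introduced for illustration, and the assumption that sz counts elements in the copy functions (so that sz * sizeof(T) bytes are moved) is inferred from the allocation functions rather than stated above.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

namespace sketch { // hypothetical namespace, not the module's own

// CPU fallback: "device" memory is ordinary heap memory.
template <typename T> T *device_malloc(size_t sz) {
  return (T *)malloc(sz * sizeof(T));
}

// calloc zero-fills the buffer and returns NULL on failure.
template <typename T> T *device_calloc(size_t sz) {
  return (T *)calloc(sz, sizeof(T));
}

template <typename T> void device_free(T *ptr) { free((void *)ptr); }

// Assumption: sz is a number of elements, as in device_malloc.
template <typename T>
void copy_to_device(T *device_ptr, T *host_ptr, size_t sz) {
  memcpy((void *)device_ptr, (void *)host_ptr, sz * sizeof(T));
}

template <typename T>
void copy_to_host(T *host_ptr, T *device_ptr, size_t sz) {
  memcpy((void *)host_ptr, (void *)device_ptr, sz * sizeof(T));
}

} // namespace sketch

// Sends four ints to the "device" and back; returns true if they survive.
bool roundtrip_demo() {
  int in[4] = {1, 2, 3, 4};
  int out[4] = {0, 0, 0, 0};
  int *dev = sketch::device_malloc<int>(4);
  assert(dev != NULL); // a real program would handle the failure instead
  sketch::copy_to_device(dev, in, 4);
  sketch::copy_to_host(out, dev, 4);
  sketch::device_free(dev);
  return memcmp(in, out, sizeof(in)) == 0;
}
```

In this sketch both copy functions are the same memcpy with the arguments swapped. The separate names only matter on targets where host and device memory are distinct.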
#define nsimd_fill_dev_mem_func(func_name, expr)
Creates a device function that fills data with expr. To call the created function, one simply does func_name(ptr, sz). The expr argument represents some simple C++ expression that can depend only on i, the index of the current element in the vector, as shown in the example below.
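nsimd_fill_dev_mem_func(prng, ((i * 1103515245 + 12345) / 65536) % 32768)
int main() {
  // ptr must point to device memory holding at least 1000 elements.
  int *ptr = device_malloc<int>(1000);
  prng(ptr, 1000);
  device_free(ptr);
  return 0;
}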
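For readers who want to see what such a macro amounts to, the following is a hedged, CPU-only analogue. The macro name sketch_fill_mem_func, the generated function fill_odd and the helper fill_odd_sum are all hypothetical, and the choice of size_t for the index i is an assumption of this sketch, not a statement about the real macro.

```cpp
#include <assert.h>
#include <stddef.h>

// Hypothetical CPU-only analogue of nsimd_fill_dev_mem_func: generates
// func_name(ptr, sz), which stores expr into ptr[i] for every index i.
#define sketch_fill_mem_func(func_name, expr)                                 \
  template <typename T> void func_name(T *ptr, size_t sz) {                   \
    for (size_t i = 0; i < sz; i++) {                                         \
      ptr[i] = (T)(expr);                                                     \
    }                                                                         \
  }

// expr may only refer to the index i.
sketch_fill_mem_func(fill_odd, 2 * i + 1)

// Fills a small buffer and returns the sum of its elements.
int fill_odd_sum() {
  int buf[4];
  fill_odd(buf, 4); // buf = {1, 3, 5, 7}
  assert(buf[3] == 7);
  return buf[0] + buf[1] + buf[2] + buf[3];
}
```

Because expr is pasted textually into the loop body, it is evaluated once per element with the current value of i, which is why it can depend only on i.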
It is often useful to allocate a pair of data buffers, one on the host and one on the device, to perform data transfers. The functions below provide quick ways to malloc, calloc, free and memcpy pointers on host and device at once. Note that when targeting CPUs the pair of pointers is reduced to one pointer that points to a single data buffer, in which case memcpy's are not performed. Note also that there is no implicit synchronization of data between both data buffers. It is up to the programmer to trigger memcpy's.
template <typename T>
struct paired_pointers_t {
  T *device_ptr, *host_ptr;
  size_t sz;
};
Members of the above structure are not to be modified, but they can be passed as arguments for reading/writing data from/to the memory they point to.
template <typename T> paired_pointers_t<T> pair_malloc(size_t sz)
Allocates sz * sizeof(T) bytes of memory on the host and on the device. If an error occurs, both pointers are NULL.
template <typename T> paired_pointers_t<T> pair_malloc_or_exit(size_t
sz)
Allocates sz * sizeof(T) bytes of memory on the host and on the device. If an error occurs, prints an error message on stderr and calls exit(3).
template <typename T> paired_pointers_t<T> pair_calloc(size_t sz)
Allocates sz * sizeof(T) bytes of memory on the host and on the device. Fills both data buffers with zeros. If an error occurs, both pointers are NULL.
template <typename T> paired_pointers_t<T> pair_calloc_or_exit(size_t
sz)
Allocates sz * sizeof(T) bytes of memory on the host and on the device. Fills both data buffers with zeros. If an error occurs, prints an error message on stderr and calls exit(3).
template <typename T> void pair_free(paired_pointers_t<T> p)
Frees the data buffers on the host and the device.
template <typename T> void copy_to_device(paired_pointers_t<T> p)
Copies data from the host buffer to its corresponding device buffer.
template <typename T> void copy_to_host(paired_pointers_t<T> p)
Copies data from the device buffer to its corresponding host buffer.
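The absence of implicit synchronization can be illustrated with a small simulation. The sketch below is not the module's implementation: it lives in a hypothetical sketch namespace, uses two separate heap buffers to stand in for host and device memory, and picks EXIT_FAILURE as exit status purely for illustration. Reading device_ptr from host code is only valid here because the device is simulated.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

namespace sketch { // hypothetical namespace; the "device" is simulated

template <typename T> struct paired_pointers_t {
  T *device_ptr, *host_ptr;
  size_t sz;
};

// Two separate zero-filled heap buffers stand in for host and device.
template <typename T> paired_pointers_t<T> pair_calloc(size_t sz) {
  paired_pointers_t<T> p;
  p.sz = sz;
  p.host_ptr = (T *)calloc(sz, sizeof(T));
  p.device_ptr = (T *)calloc(sz, sizeof(T));
  if (p.host_ptr == NULL || p.device_ptr == NULL) {
    free((void *)p.host_ptr); // free(NULL) is a no-op
    free((void *)p.device_ptr);
    p.host_ptr = NULL;
    p.device_ptr = NULL;
  }
  return p;
}

template <typename T> paired_pointers_t<T> pair_calloc_or_exit(size_t sz) {
  paired_pointers_t<T> p = pair_calloc<T>(sz);
  if (p.host_ptr == NULL) {
    fprintf(stderr, "error: cannot allocate %lu elements\n",
            (unsigned long)sz);
    exit(EXIT_FAILURE); // exit status chosen for this sketch
  }
  return p;
}

template <typename T> void pair_free(paired_pointers_t<T> p) {
  free((void *)p.host_ptr);
  free((void *)p.device_ptr);
}

template <typename T> void copy_to_device(paired_pointers_t<T> p) {
  memcpy((void *)p.device_ptr, (void *)p.host_ptr, p.sz * sizeof(T));
}

template <typename T> void copy_to_host(paired_pointers_t<T> p) {
  memcpy((void *)p.host_ptr, (void *)p.device_ptr, p.sz * sizeof(T));
}

} // namespace sketch

// Returns true if the device copy changes only after an explicit transfer.
bool explicit_sync_demo() {
  sketch::paired_pointers_t<int> p = sketch::pair_calloc_or_exit<int>(4);
  assert(p.host_ptr != NULL && p.device_ptr != NULL); // ensured by _or_exit
  p.host_ptr[0] = 42;                  // host write only
  bool stale = (p.device_ptr[0] == 0); // device buffer is still zeroed
  sketch::copy_to_device(p);           // explicit transfer
  bool synced = (p.device_ptr[0] == 42);
  sketch::pair_free(p);
  return stale && synced;
}
```

The structure is passed by value throughout, which is cheap because it only holds two pointers and a size. The pointed-to buffers are shared between all copies of the structure.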