QMCPACK
CUDADeviceManager Class Reference

CUDA device manager.

Public Member Functions

 CUDADeviceManager (int &default_device_num, int &num_devices, int local_rank, int local_size)
 

Private Attributes

int cuda_default_device_num
 
int cuda_device_count
 

Detailed Description

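CUDA device manager. On construction it queries the number of visible CUDA devices, checks that count and the device assigned to this rank against any previously recorded values, and binds every OpenMP thread of the calling process to the assigned device.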

Definition at line 22 of file CUDADeviceManager.h.

Constructor & Destructor Documentation

◆ CUDADeviceManager()

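CUDADeviceManager::CUDADeviceManager(int& default_device_num, int& num_devices, int local_rank, int local_size)

Parameters
  default_device_num  [in,out] Device number recorded for this rank. If negative on entry, it is set to the device chosen by this manager; otherwise it must equal that device, or std::runtime_error is thrown.
  num_devices         [in,out] Device count recorded so far. If 0 on entry, it is set to the count reported by cudaGetDeviceCount; otherwise it must equal that count, or std::runtime_error is thrown.
  local_rank          [in] Rank index of the calling MPI process, used together with local_size to pick the default device.
  local_size          [in] Number of MPI ranks sharing the devices. A warning is printed if there are more devices than ranks.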

Definition at line 22 of file CUDADeviceManager.cpp.

References qmcplusplus::app_warning(), CUDADeviceManager::cuda_default_device_num, CUDADeviceManager::cuda_device_count, qmcplusplus::cudaErrorCheck(), cudaFree, cudaGetDeviceCount, cudaSetDevice, and qmcplusplus::determineDefaultDeviceNum().

{
  // Query how many CUDA devices are visible to this process.
  cudaErrorCheck(cudaGetDeviceCount(&cuda_device_count), "cudaGetDeviceCount failed!");
  // Record the count on first use; otherwise it must match the earlier record.
  if (num_devices == 0)
    num_devices = cuda_device_count;
  else if (num_devices != cuda_device_count)
    throw std::runtime_error("Inconsistent number of CUDA devices with the previous record!");
  if (cuda_device_count > local_size)
    app_warning() << "More CUDA devices than the number of MPI ranks. "
                  << "Some devices will be left idle.\n"
                  << "There is potential performance issue with the GPU affinity. "
                  << "Use CUDA_VISIBLE_DEVICES or MPI launcher to expose desired devices.\n";
  if (num_devices > 0)
  {
    // Pick the device for this rank, then reconcile it with the earlier record.
    cuda_default_device_num = determineDefaultDeviceNum(cuda_device_count, local_rank, local_size);
    if (default_device_num < 0)
      default_device_num = cuda_default_device_num;
    else if (default_device_num != cuda_default_device_num)
      throw std::runtime_error("Inconsistent assigned CUDA devices with the previous record!");

    // Bind every OpenMP thread to the device; cudaFree(0) forces context creation.
#pragma omp parallel
    {
      cudaErrorCheck(cudaSetDevice(cuda_default_device_num), "cudaSetDevice failed!");
      cudaErrorCheck(cudaFree(0), "cudaFree failed!");
    }
  }
}
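The two "previous record" checks in the constructor follow the same pattern: an in/out reference is filled in when it still holds its unset value, and must otherwise agree with what was just detected. A minimal standalone sketch of that pattern is given here. The helper names `reconcileDeviceCount` and `reconcileDefaultDevice` are hypothetical and do not exist in QMCPACK; the unset values (0 for the count, negative for the device number) are taken from the listing above.

```cpp
#include <stdexcept>

// Hypothetical helper (not part of QMCPACK): mirrors the num_devices check.
// A recorded value of 0 means "nothing recorded yet".
inline void reconcileDeviceCount(int& recorded, int detected)
{
  if (recorded == 0)
    recorded = detected;
  else if (recorded != detected)
    throw std::runtime_error("Inconsistent number of devices with the previous record!");
}

// Hypothetical helper (not part of QMCPACK): mirrors the default_device_num check.
// A negative recorded value means "no device assigned yet".
inline void reconcileDefaultDevice(int& recorded, int assigned)
{
  if (recorded < 0)
    recorded = assigned;
  else if (recorded != assigned)
    throw std::runtime_error("Inconsistent assigned devices with the previous record!");
}
```

Because the record is passed by reference, the first caller sets it and every later caller is checked against it. This is one plausible reason for the constructor taking `int&` rather than `int` for these two arguments.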
Referenced symbols:

std::ostream & app_warning()
Definition: OutputManager.h:69

#define cudaSetDevice
Definition: cuda2hip.h:148

#define cudaFree
Definition: cuda2hip.h:99

#define cudaGetDeviceCount
Definition: cuda2hip.h:102

int determineDefaultDeviceNum(int num_devices, int rank_id, int num_ranks)
Distributes MPI ranks among devices.
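This page does not show how ranks are distributed among devices, so the following is only an illustrative sketch of one common scheme, a contiguous block distribution. The function name `sketchDefaultDeviceNum` is hypothetical, and the policy (the first few devices take one extra rank when the division is uneven, and surplus devices stay idle) is an assumption, not a statement about the body of `determineDefaultDeviceNum()`.

```cpp
#include <cassert>

// Hypothetical sketch (not the QMCPACK implementation): assign rank_id to a
// device so that consecutive ranks share a device and the per-device rank
// counts differ by at most one.
inline int sketchDefaultDeviceNum(int num_devices, int rank_id, int num_ranks)
{
  assert(num_devices > 0 && num_ranks > 0 && rank_id >= 0 && rank_id < num_ranks);

  // With fewer ranks than devices, only the first num_ranks devices are used.
  if (num_ranks < num_devices)
    num_devices = num_ranks;

  const int base  = num_ranks / num_devices; // minimum ranks per device
  const int extra = num_ranks % num_devices; // devices that take one more rank

  // The first `extra` devices each host (base + 1) ranks.
  if (rank_id < (base + 1) * extra)
    return rank_id / (base + 1);
  // The remaining devices each host `base` ranks.
  return (rank_id - extra) / base;
}
```

With 2 devices and 5 ranks, this sketch maps ranks 0 to 2 to device 0 and ranks 3 and 4 to device 1. With 4 devices and 2 ranks, only devices 0 and 1 are used, which corresponds to the case in which the constructor warns that some devices are left idle.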

Member Data Documentation

◆ cuda_default_device_num

int cuda_default_device_num
private

Definition at line 24 of file CUDADeviceManager.h.

Referenced by CUDADeviceManager::CUDADeviceManager().

◆ cuda_device_count

int cuda_device_count
private

Definition at line 25 of file CUDADeviceManager.h.

Referenced by CUDADeviceManager::CUDADeviceManager().


The documentation for this class was generated from the following files: