November 16, 2022

Using CUDA from Python

A simple numba + CUDA test

    import numpy as np
    from timeit import default_timer as timer
    from numba import vectorize

    # Compile an element-wise float32 add that runs on the GPU.
    @vectorize(['float32(float32, float32)'], target='cuda')
    def vector_add(a, b):
        return a + b

    def main():
        n = 3200000
        a = np.ones(n, dtype=np.float32)
        b = np.ones(n, dtype=np.float32)
        # c = np.ones(n, dtype=np.float32)

        # Time a single call, including host-to-device transfers.
        start = timer()
        c = vector_add(a, b)
        vector_add_time = timer() - start

        print(c[:5])
        print(c[-5:])
        print(vector_add_time)

    if __name__ == '__main__':
        main()
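
A single timed call like the one above can include one-off setup costs on top of the addition itself. The following is a minimal sketch of a fairer measurement, assuming a warm-up call followed by the best of several runs is acceptable. The helper name `best_time` and the `repeats` parameter are made up for illustration, and `np.add` is only a CPU stand-in so the sketch runs without a GPU.

```python
import numpy as np
from timeit import default_timer as timer

def best_time(func, *args, repeats=5):
    """Call func once as a warm-up, then return (result, best elapsed seconds)."""
    result = func(*args)  # warm-up call: absorbs one-off setup costs
    best = float("inf")
    for _ in range(repeats):
        start = timer()
        result = func(*args)
        best = min(best, timer() - start)
    return result, best

n = 3200000
a = np.ones(n, dtype=np.float32)
b = np.ones(n, dtype=np.float32)

# np.add is a CPU stand-in (an assumption for this sketch);
# pass vector_add from the test above to time the CUDA version.
c, elapsed = best_time(np.add, a, b)
print(c[:5], elapsed)
```

Passing `vector_add` instead of `np.add` gives a number that is easier to compare against the CPU result, since both are measured the same way.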

Running it fails with:

numba.cuda.cudadrv.error.NvvmSupportError: libNVVM cannot be found. Do `conda install cudatoolkit`

The environment-variable fixes I found online did not help.

In the end I pointed numba at the CUDA toolkit manually:

    import os

    # Set these before importing numba so they are seen when it looks for the toolkit.
    os.environ['NUMBAPRO_NVVM'] = r'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\nvvm\bin\nvvm64_32_0.dll'
    os.environ['NUMBAPRO_LIBDEVICE'] = r'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\nvvm\libdevice'
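
A wrong path here is easy to miss, because the variables are accepted silently. As a rough sketch, the two assignments can be wrapped in a check that the files actually exist. The helper `set_numba_cuda_paths` and its `environ` parameter are hypothetical names introduced for this example; the paths are the ones used above and will differ on other installs.

```python
import os

def set_numba_cuda_paths(nvvm_dll, libdevice_dir, environ=os.environ):
    """Set both variables only if both paths exist; return True on success."""
    if not os.path.isfile(nvvm_dll) or not os.path.isdir(libdevice_dir):
        return False
    environ['NUMBAPRO_NVVM'] = nvvm_dll
    environ['NUMBAPRO_LIBDEVICE'] = libdevice_dir
    return True

# Toolkit location from this post (CUDA v9.0 on Windows); adjust for your machine.
cuda_root = r'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0'
ok = set_numba_cuda_paths(
    os.path.join(cuda_root, 'nvvm', 'bin', 'nvvm64_32_0.dll'),
    os.path.join(cuda_root, 'nvvm', 'libdevice'),
)
print('paths set' if ok else 'CUDA toolkit files not found at the expected location')
```

Running this before `import numba` shows right away whether the toolkit is where the script expects it.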

One tip:

In C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\nvvm\libdevice, rename libdevice..10.bc to libdevice.compute_52.10.bc, where compute_52 is the compute capability of the GPU. Otherwise the error persists.
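
The rename can also be done from Python. This is a minimal sketch, not the exact step used above: it copies the file instead of renaming it, so the original stays in place. The function `alias_libdevice` and its parameters are hypothetical, and `src_name` has to be whatever the libdevice file is actually called in your install.

```python
import os
import shutil

def alias_libdevice(libdevice_dir, src_name, compute='compute_52'):
    """Copy src_name to libdevice.<compute>.10.bc and return the new path."""
    src = os.path.join(libdevice_dir, src_name)
    dst = os.path.join(libdevice_dir, 'libdevice.%s.10.bc' % compute)
    if not os.path.exists(dst):
        # Copy rather than rename (a choice made for this sketch)
        # so the original file is still there if something goes wrong.
        shutil.copyfile(src, dst)
    return dst
```

Call it with the libdevice directory from the step above and the existing file name. Writing into C:\Program Files usually needs an elevated prompt, and `compute` should match your own GPU.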