NumbaPro is a powerful compiler that takes high-level Python code directly to the GPU, producing fast code equivalent to what you would write in a lower-level language. It contains an implementation of CUDA Python as well as higher-level constructs that make it easy to map array-oriented code onto the parallel architecture of the GPU.
NumbaPro, which is part of the Anaconda Python distribution from Continuum Analytics, provides support for programming the GPU from the high-level Python language. There are two APIs. The first provides a high-level functional approach in which NumPy array expressions are compiled automatically to execute in parallel on the GPU; it can also vectorize scalar functions to operate on arrays stored on the GPU. The low-level API provides CUDA support in Python: this "CUDA-Python" dialect makes it easy to access shared memory and synchronization primitives directly using a simplified Python syntax. Together, these APIs make NumbaPro an easier interface for unleashing the power of GPUs from Python with NumPy arrays. (Coauthored by Siu Kwan Lam.)
GPUs can offer orders-of-magnitude speed-ups for certain calculations, but programming the GPU remains difficult. Using NVIDIA's new LLVM support, Continuum Analytics has built an array-oriented Python compiler called Numba that can target the GPU. In this talk, I will demonstrate how Numba makes programming the GPU as easy as a one-line change to working Python code.