Course Overview
This workshop teaches you the fundamental tools and techniques for running GPU-accelerated Python applications using CUDA® GPUs and the Numba compiler. You’ll work through dozens of hands-on coding exercises and, at the end of the training, implement a new workflow to accelerate a fully functional linear algebra program originally designed for CPUs, observing the resulting performance gains. After the workshop ends, you’ll have additional resources to help you create new GPU-accelerated applications on your own.
Who Should Attend
Developers who use Python
Course Objectives
- GPU-accelerate NumPy ufuncs with a few lines of code.
- Configure code parallelization using the CUDA thread hierarchy.
- Write custom CUDA device kernels for maximum performance and flexibility.
- Use memory coalescing and on-device shared memory to increase CUDA kernel bandwidth.
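To give a feel for the thread-hierarchy objective above, here is a minimal pure-Python sketch of the index arithmetic behind a one-dimensional CUDA launch. It is not course material and does not use Numba or a GPU. The function names `blocks_needed` and `simulate_launch` are hypothetical, and the nested loops only stand in for threads that a real GPU would run in parallel.

```python
# Pure-Python sketch of the 1D CUDA thread hierarchy arithmetic (no GPU needed).
# All names are illustrative assumptions, not part of Numba or CUDA.

def blocks_needed(n_elements, threads_per_block):
    """Ceiling division: enough blocks so that every element gets a thread."""
    return (n_elements + threads_per_block - 1) // threads_per_block


def simulate_launch(data, threads_per_block):
    """Mimic a kernel that doubles each element, one element per thread."""
    n = len(data)
    out = [None] * n
    blocks_per_grid = blocks_needed(n, threads_per_block)
    for block_idx in range(blocks_per_grid):          # stands in for blockIdx.x
        for thread_idx in range(threads_per_block):   # stands in for threadIdx.x
            # Global index: block offset plus position within the block.
            i = block_idx * threads_per_block + thread_idx
            if i < n:  # bounds guard: the last block may have spare threads
                out[i] = data[i] * 2
    return out


if __name__ == "__main__":
    print(blocks_needed(1000, 256))                       # 4
    print(simulate_launch([1, 2, 3, 4, 5], threads_per_block=2))
```

The bounds guard matters because the grid is rounded up to whole blocks: 1000 elements at 256 threads per block yields 4 blocks, or 1024 threads, so the last 24 threads must do nothing.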
Course Outline
- Course Introduction
- Introduction to CUDA Python with Numba
- Custom CUDA Kernels in Python with Numba
- Multidimensional Grids and Shared Memory for CUDA Python with Numba
- Final Review