CUDA Streams

Overview

  • Tutorial: 20 min

    Objectives:
    • Learn how to use CUDA streams in CuPy.

Key Points

  • CUDA streams order the operations issued to the GPU and let independent operations overlap.

A CUDA stream is a sequence of operations that execute on the GPU in the order they are issued. Operations within one stream never overlap each other, but operations issued on different streams may run concurrently, which can improve performance by overlapping computation and data transfers.

Here’s an example of how to use CUDA streams in CuPy:

    import numpy as np
    import cupy as cp

    a_np = np.arange(10)
    s = cp.cuda.Stream()

    with s:
        a_cp = cp.asarray(a_np)  # H2D transfer on stream s
        b_cp = cp.sum(a_cp)      # kernel launched on stream s

    # Alternatively, call use(): every subsequent CUDA operation is issued
    # on the given stream until another stream is made current.
    s.use()

    b_np = cp.asnumpy(b_cp)

    assert s == cp.cuda.get_current_stream()

    # go back to the default stream
    cp.cuda.Stream.null.use()

    # s is no longer the current stream; if the assert condition were false,
    # the run would stop with an AssertionError
    assert s != cp.cuda.get_current_stream()

Explanation:

  1. Create Arrays:

    ```python
    a_np = np.arange(10)
    s = cp.cuda.Stream()
    ```
    A NumPy array a_np is created, and a CUDA stream s is initialized.

  2. Using the Stream:

    ```python
    with s:
        a_cp = cp.asarray(a_np)  # H2D transfer on stream s
        b_cp = cp.sum(a_cp)      # kernel launched on stream s
    ```
    Within the context of the stream s, the NumPy array a_np is transferred to the GPU as a_cp, and a sum operation is performed on a_cp.

  3. Using `use()` Method:

    ```python
    s.use()
    b_np = cp.asnumpy(b_cp)
    ```
    The use() method sets the stream s as the current stream. Any subsequent CUDA operations will use this stream until changed. The result b_cp is transferred back to the host as b_np.

  4. Assertions:

    ```python
    assert s == cp.cuda.get_current_stream()
    ```
    This assertion checks that the current stream is s.

  5. Reverting to Default Stream:

    ```python
    cp.cuda.Stream.null.use()
    assert s != cp.cuda.get_current_stream()  # raises AssertionError if the condition is false
    ```
    The default stream is set as the current stream, and an assertion checks that the current stream is no longer s.

Taken together, the steps above show how to choose which CUDA stream CuPy issues its operations on. Issuing independent work on separate streams is what makes concurrent execution on the GPU possible.

Key Points

  • Operations within one CUDA stream run in issue order; operations on different streams may execute concurrently on the GPU.

  • Streams can be used to overlap computation and data transfers for improved performance.

  • The use() method sets the current stream for subsequent CUDA operations.