

Generate CUDA code for stencil functions

Since R2022b


    [OUT_1,...,OUT_N] = stencilfun(func,X,W) applies the function func to each sliding window of size W of the input array X. stencilfun returns one output array for each output of func, and all output arrays have the same size. Each call to func computes a single element of each output array. The index of this element corresponds to the center of the sliding window in the input array.
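    The sliding-window behavior described above can be modeled in plain MATLAB. The following is a minimal sketch, not the GPU Coder implementation. It assumes a 2-D input, an odd window size, Shape = 'same', a stride of 1, and zero padding. The callback func and the variable names are hypothetical.

```matlab
% Reference model only (no GPU Coder required). Assumptions: 2-D input,
% odd window size, Shape = 'same', Stride = 1, padding value 0.
X = magic(4);                        % input array
W = [3 3];                           % window size
padValue = 0;                        % mirrors the documented default PaddingValue
func = @(win) win(2,2) + win(1,2);   % hypothetical callback: center + element above it

half = (W - 1) / 2;                  % padding needed on each side
Xpad = padValue * ones(size(X) + 2*half);
Xpad(half(1)+1:end-half(1), half(2)+1:end-half(2)) = X;

out = zeros(size(X));
for c = 1:size(X, 2)
    for r = 1:size(X, 1)
        win = Xpad(r:r+W(1)-1, c:c+W(2)-1);   % window centered on X(r,c)
        out(r, c) = func(win);                % one call computes one output element
    end
end
```

    In this model, out(r,c) is the sum of X(r,c) and the element directly above it, with the padding value standing in for elements outside X.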


    ___ = stencilfun(___,Name,Value) performs the stencil operation using the options specified by one or more Name,Value pair arguments.



    This example shows how to perform 2-D convolution of an array with a 5x5 filter by using the stencilfun function.

    Define the MATLAB® entry-point function.

    function Out = myconv(In, W)
    % Capture the filter W in the callback so that stencilfun passes only the window X.
    fh = @(X) stencilFcn(X, W);
    Out = stencilfun(fh, In, [5 5], Shape = 'same');

    The stencil function stencilFcn is defined as:

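    function y = stencilFcn(X, W)
    coder.inline('always');   % the callback must always be inlined
    y = 0;
    % Accumulate the element-wise product of the 5-by-5 window X and the filter W.
    for j = 1:5
        for i = 1:5
            y = y + X(i,j) * W(i,j);
        end
    end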

    Use the codegen command to generate CUDA® code.
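    One way the code generation step might look is sketched below. This is an assumption rather than the command from the original example: the input sizes, the double data type, and the choice of a MEX build target are illustrative, and the sketch presumes that GPU Coder is installed and that myconv.m and stencilFcn.m are on the MATLAB path.

```matlab
% Sketch only: example inputs are assumptions chosen for illustration.
In = rand(256, 256);      % example 2-D double input
W  = ones(5, 5) / 25;     % example 5-by-5 averaging filter

% Build a CUDA MEX function from the myconv entry point.
cfg = coder.gpuConfig('mex');
codegen -config cfg -args {In, W} myconv

% Run the generated MEX function.
Out = myconv_mex(In, W);
```

    The example values passed to -args fix the size and type of the entry-point inputs in the generated code.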

    Input Arguments


    Function to apply to the elements of the input arrays, specified as a function handle. func must take the following form:

    [Y_1, ..., Y_N] = FUNC(X1, IDX_1, ..., IDX_M);

    X1 denotes a subarray of X of size W. The sliding window corresponding to X1 can extend beyond the boundary of X. In that case, X1 is padded with a constant value wherever necessary.

    Each parameter IDX_K denotes the index along dimension K of the output element currently being computed and has type int32. func can accept any number of IDX parameters, but that number must be fixed.

    Each output Y_K must be a scalar value of a supported data type and supplies a single element of the K-th output array. The type of Y_K determines the type of the array OUT_K.

    func can be an anonymous or nested function. In this case, func can reference additional variables that are in scope where func is defined. This behavior is useful if the stencil operation requires parameters in addition to the input window.
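    As an illustration of the callback form, the sketch below defines a callback with one index parameter and two outputs. The function names windowStats and avgAndRow are hypothetical, and the input is assumed to be a 2-D single or double array.

```matlab
function [Avg, RowIdx] = windowStats(In)
% Hypothetical entry point: 3-by-3 box average plus the row index of each element.
[Avg, RowIdx] = stencilfun(@avgAndRow, In, [3 3], Shape = 'same');
end

function [avg, row] = avgAndRow(X, idxRow)
% X is the 3-by-3 window; idxRow is the int32 index along dimension 1.
coder.inline('always');
s = zeros('like', X);        % scalar accumulator with the class of the window
for j = 1:3
    for i = 1:3
        s = s + X(i,j);      % subscripted, single-element access
    end
end
avg = s / 9;                 % first output: determines the type of Avg
row = idxRow;                % second output: int32, so RowIdx is an int32 array
end
```

    Because each output type follows the type of the corresponding callback output, Avg takes the class of In and RowIdx is int32 in this sketch.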

    Example: out = stencilfun(fh, X, [5 5]);

    Data Types: function_handle

    X must be an array of supported data type.

    Example: out = stencilfun(fh, In, [5 5]);

    Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | logical | half

    W must be a numeric row vector where W(D) specifies the window size along dimension D. The input array and the window can each have any number of dimensions, and the two numbers do not need to be equal.

    Example: out = stencilfun(fh, In, [5 5], Shape = 'same');

    Name-Value Arguments

    Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

    Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

    Example: out = stencilfun(fh, In, [5 5], Shape = 'same');

    Specifies the size of each output array by determining the amount of padding to be applied on each dimension of the input. 'Shape' must be specified as one of these values:

    • 'full' — Maximum padding is applied to the input array such that each sliding window accesses at least one element of the unpadded input array.

    • 'same' — The input array is padded so that each output array has the same size as the input array (assuming no striding is used).

    • 'valid' — The input array is not padded.

    Example: out = stencilfun(fh, In, [5 5], Shape = 'same');
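    The output sizes implied by the three Shape options can be worked out for a concrete case. The arithmetic below is a sketch derived from the descriptions above, assuming a stride of 1. The variable names and example sizes are illustrative.

```matlab
% Output sizes implied by each Shape option, assuming Stride = 1.
inSize = [8 10];              % size of the input array (example values)
W      = [5 5];               % window size

sizeFull  = inSize + W - 1;   % every window overlaps the input by at least one element
sizeSame  = inSize;           % output matches the input
sizeValid = inSize - W + 1;   % every window lies entirely inside the input
```

    For an 8-by-10 input and a 5-by-5 window, this arithmetic gives 12-by-14, 8-by-10, and 4-by-6 outputs, respectively.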

    Specify the preprocess function to apply to all elements of the input array (including padding elements) before performing the stencil computation. 'Preprocess' must be a function handle that takes the following form:

    Y = preprocess(X);
    where X denotes an input element and Y denotes the preprocessed result. By default, no preprocessing is applied.

    Example: out = stencilfun(fh, In, [5 5], Preprocess = @(x) single(3.14*x));
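    Because preprocessing also applies to padding elements, the values the callback sees at the border are the preprocessed padding values, not the raw ones. The plain MATLAB model below illustrates this under assumed settings (a 3-by-3 window, Shape = 'same', padding value 0). It is not how GPU Coder implements the option, and the pre function is hypothetical.

```matlab
% Model of Preprocess acting on a padded input (illustration only).
X = [1 2; 3 4];
padValue = 0;
pre = @(x) x + 10;               % hypothetical Preprocess function

Xpad = padValue * ones(4, 4);    % one-element border, as a 3-by-3 'same' window needs
Xpad(2:3, 2:3) = X;
Xpre = arrayfun(pre, Xpad);      % elements as the callback would see them
```

    Here the border of Xpre holds 10 rather than 0, so a preprocess function that does not map the padding value to itself effectively changes the padding.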

    Specify the step size for traversing the input array when sliding the window in each dimension. 'Stride' must be a numeric scalar or a row vector. If 'Stride' is a scalar, then that value specifies the stride for each dimension. If 'Stride' is a vector, then Stride(D) specifies the stride along dimension D. The default value of 'Stride' is 1.

    Example: out = stencilfun(fh, In, [5 5], Stride = 2);

    Specifies the value to use for padding the input array. 'PaddingValue' must have the same class as the input array. The default value of 'PaddingValue' is 0.

    Example: out = stencilfun(fh, In, [5 5], PaddingValue = 1);
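    The name-value arguments can be combined in one call. The sketch below uses a per-dimension stride and a padding value whose class matches a single input. The function names stridedMax and max3x3 are hypothetical, and the input is assumed to be a 2-D single array.

```matlab
function Out = stridedMax(In)
% Hypothetical entry point: 3-by-3 maximum filter, stepping 2 along
% dimension 1 and 1 along dimension 2. In is assumed to be single.
Out = stencilfun(@max3x3, In, [3 3], ...
    Shape = 'same', ...
    Stride = [2 1], ...
    PaddingValue = single(-Inf));   % same class as In; padding never wins the max
end

function m = max3x3(X)
coder.inline('always');
m = X(1,1);
for j = 1:3
    for i = 1:3
        if X(i,j) > m
            m = X(i,j);
        end
    end
end
end
```

    Padding with -Inf is a design choice for a maximum filter: with the default padding value of 0, border outputs would be wrong for inputs that contain only negative values.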

    Output Arguments


    The number of outputs equals the number of outputs of func. All output arrays have the same size.


    • Avoid using toolbox functions such as sum inside the callback function func. To iterate over the elements of the window, use explicit loops instead.

    • When indexing the window parameter of the callback function func, each index operation must access only a single element of the window. Linear indexing is not supported.

    • The callback must always be inlined using coder.inline('always').

    • The window parameter must not be modified inside the callback.
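    The loop pattern that these guidelines call for can be sketched as follows. The matrix X stands in for a single 3-by-3 window, and the variable names are illustrative. Inside a real callback, the same loop would follow a coder.inline('always') call.

```matlab
% X stands in for one 3-by-3 window passed to the callback.
X = magic(3);

% Discouraged inside the callback: s = sum(X(:));
% (toolbox function plus linear indexing)

% Preferred: explicit loops with subscripted, single-element reads,
% and no writes to the window itself.
s = 0;
for j = 1:3
    for i = 1:3
        s = s + X(i,j);
    end
end
```

    The loop reads one element per indexing operation and accumulates into a separate variable, so the window is never modified.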

    Version History

    Introduced in R2022b