
This example shows how to generate CUDA® code from a simple MATLAB® function by using GPU Coder™. A Mandelbrot set implementation written with standard MATLAB commands serves as the entry-point function. The example uses the `codegen` command to generate a MEX function that runs on the GPU. You can run the MEX function to check for run-time errors.

This example requires:

- CUDA-enabled NVIDIA® GPU with compute capability 3.2 or higher.
- NVIDIA CUDA toolkit.
- Environment variables for the compilers and libraries. For more information, see Environment Variables.

The following line of code creates a folder in your current working folder (pwd), and copies all the relevant files into this folder. If you do not want to perform this operation or if you cannot generate files in this folder, change your current working folder.

```
gpucoderdemo_setup('gpucoderdemo_mandelbrot');
```

Use the `coder.checkGpuInstall` function to verify that the compilers and libraries needed for running this example are set up correctly.

```
envCfg = coder.gpuEnvConfig('host');
envCfg.BasicCodegen = 1;
envCfg.Quiet = 1;
coder.checkGpuInstall(envCfg);
```

The Mandelbrot set is the region in the complex plane consisting of the values $z_0$ for which the trajectories defined by

$$z_{k+1} = z_k^2 + z_0, \quad k = 0, 1, \ldots$$

remain bounded as $k \to \infty$. The overall geometry of the Mandelbrot set is shown in the figure. This view does not have the resolution to show the richly detailed structure of the fringe just outside the boundary of the set.
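To make the definition concrete, the following minimal sketch applies the iteration `z = z*z + z0` to two sample points and records whether each trajectory stays within a radius of 2. The sample points and the variable names `points` and `bounded` are illustrative assumptions and are not part of the shipped example. The escape radius and iteration limit mirror those used by the entry-point function.

```matlab
% Sketch: test two sample points with the iteration z = z*z + z0.
% The points and variable names are illustrative assumptions.
maxIterations = 500;
points = [0, 1];              % 0 stays at 0; 1 escapes (1, 2, 5, 26, ...)
bounded = true(size(points));

for p = 1:numel(points)
    z0 = points(p);
    z = z0;
    for n = 0:maxIterations
        z = z*z + z0;
        if abs(z) > 2         % once |z| > 2, the trajectory diverges
            bounded(p) = false;
            break
        end
    end
end

disp(bounded)
```

A finite iteration limit can only approximate the definition: a point that has not escaped after `maxIterations` steps is treated as bounded.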

For this tutorial, pick a set of limits that specify a highly zoomed part of the Mandelbrot set in the valley between the main cardioid and the bulb to its left. A `1000x1000` grid of $\mathrm{Re}\{z_0\}$ and $\mathrm{Im}\{z_0\}$ values is created between these two limits. The Mandelbrot algorithm is then iterated at each grid location. An iteration number of 500 is enough to render the image in full resolution.

```
maxIterations = 500;
gridSize = 1000;
xlim = [-0.748766713922161, -0.748766707771757];
ylim = [ 0.123640844894862,  0.123640851045266];

x = linspace( xlim(1), xlim(2), gridSize );
y = linspace( ylim(1), ylim(2), gridSize );
[xGrid,yGrid] = meshgrid( x, y );
```

The `mandelbrot_count.m` function contains a vectorized implementation of the Mandelbrot set based on the code provided in the e-book *Experiments in MATLAB* by Cleve Moler. The `%#codegen` directive enables MATLAB error checking for code generation. When GPU Coder encounters the `coder.gpu.kernelfun` pragma, it attempts to parallelize all the computation within this function and then maps it to the GPU.

```
type mandelbrot_count
```

```
% getting started example (mandelbrot_count.m)
function count = mandelbrot_count(maxIterations, xGrid, yGrid) %#codegen
% mandelbrot computation

z0 = xGrid + 1i*yGrid;
count = ones(size(z0));

% Map computation to GPU
coder.gpu.kernelfun;

z = z0;
for n = 0:maxIterations
    z = z.*z + z0;
    inside = abs(z)<=2;
    count = count + inside;
end
count = log(count);
```

Run the `mandelbrot_count` function with the `xGrid` and `yGrid` values that were previously generated and plot the results.

```
count = mandelbrot_count(maxIterations, xGrid, yGrid);

figure(2), imagesc( x, y, count );
colormap( [jet();flipud( jet() );0 0 0] );
title('Mandelbrot Set on MATLAB');
axis off
```

To generate CUDA MEX for the `mandelbrot_count` function, create a GPU code configuration object and use the `codegen` function. Because of architectural differences between the CPU and GPU, numerical results do not always match. This is especially true when your MATLAB code uses the single data type and performs accumulation operations on single-precision values. However, there are cases, like this Mandelbrot example, where even double data types cause numerical differences. One reason for the mismatch is that the GPU floating-point units use fused floating-point multiply-add (FMAD) instructions while the CPU does not use these instructions. The `--fmad=false` option that is passed to the `nvcc` compiler turns off this FMAD optimization.

```
cfg = coder.gpuConfig('mex');
cfg.GpuConfig.CompilerFlags = '--fmad=false';
codegen -config cfg -args {maxIterations,xGrid,yGrid} mandelbrot_count
```
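To see why a fused multiply-add can change results, it helps to write out where rounding occurs. As a sketch, let $\operatorname{rnd}(\cdot)$ denote rounding to the nearest representable floating-point value (notation introduced here for illustration only). For operands $a$, $b$, and $c$:

$$\text{separate multiply and add:}\quad \operatorname{rnd}\bigl(\operatorname{rnd}(a \cdot b) + c\bigr) \qquad\qquad \text{fused multiply-add:}\quad \operatorname{rnd}(a \cdot b + c)$$

The fused form rounds once rather than twice, so the two results can differ in the last bit. In an iteration such as `z = z.*z + z0`, applied 501 times at each grid location, differences of this size can plausibly grow until the test `abs(z)<=2` gives a different answer for points near the boundary of the set.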

After you generate a MEX function, you can verify that it has the same functionality as the original MATLAB entry-point function. Run the generated `mandelbrot_count_mex` and plot the results.

```
countGPU = mandelbrot_count_mex(maxIterations, xGrid, yGrid);

figure(2), imagesc( x, y, countGPU );
colormap( [jet();flipud( jet() );0 0 0] );
title('Mandelbrot Set on GPU');
axis off
```
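Beyond comparing the two plots by eye, a numerical comparison can make the verification more precise. The following is a minimal sketch. The helper names `maxAbsDiff` and `numDiffer` and the stand-in arrays `a` and `b` are assumptions introduced for illustration so that the snippet runs on its own. In the example workflow, you would pass `count` and `countGPU` instead.

```matlab
% Sketch: quantify how much two result arrays differ.
% Helper names and stand-in data are illustrative assumptions.
maxAbsDiff = @(u, v) max(abs(u(:) - v(:)));   % largest elementwise gap
numDiffer  = @(u, v) nnz(u ~= v);             % number of unequal elements

% Stand-in data; replace a and b with count and countGPU in the example.
a = log([1 2; 3 4]);
b = a;
b(2,2) = log(5);

fprintf('Max abs difference: %g (%d of %d elements differ)\n', ...
    maxAbsDiff(a, b), numDiffer(a, b), numel(a));
```

Exact equality is a strict criterion for floating-point output, so a small tolerance on `maxAbsDiff` may be a more practical check in some workflows.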

In this example, CUDA code was generated for a simple MATLAB function implementing the Mandelbrot set. This was accomplished by using the `coder.gpu.kernelfun` pragma and invoking the `codegen` command to generate a MEX function. An additional compiler flag, `--fmad=false`, was passed to the `nvcc` compiler to disable the FMAD optimization that the NVIDIA compilers perform.

Remove the generated files and return to the original folder.

```
cleanup
```