CUDA code generation using the codegen command

Bhushan Ravindra Attarde on 16 May 2020

https://kr.mathworks.com/matlabcentral/answers/525918-cuda-code-generation-using-codegen-command

Hi, I am generating CUDA code for the following function:

function [out] = My_Fcn(A,B,C,D,X,U) %#codegen
    n = size(U,2);
    DU = D*U(:,1:n);
    out = coder.nullcopy(zeros(size(U)));
    coder.gpu.kernelfun();
    % State update: each column of X depends on the previous column
    for i=1:1:n
        X(:,i+1) = A*X(:,i) + B*U(:,i);
    end
    out = C*X(:,1:n) + DU;
end
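For reference, here is a minimal CPU sketch of what the function above computes, written in plain C++ with column-major flat arrays (the layout MATLAB uses, and the reason a 3-by-4 X appears as `X[12]` in the generated signatures). The names `idx` and `my_fcn_reference` are hypothetical and are not part of the generated code. The sketch assumes all matrices are m-by-m and X is m-by-(m+1), as in the script in this post.

```cpp
#include <cstddef>
#include <vector>

// Column-major element access: element (r, c) of a matrix with `rows` rows.
inline std::size_t idx(std::size_t r, std::size_t c, std::size_t rows) {
    return r + c * rows;
}

// Hypothetical CPU reference for My_Fcn (not generated code).
// Assumption: A, B, C, D, U are m-by-m and X is m-by-(m+1), all column-major.
// X is updated in place; the m-by-m output is returned.
std::vector<double> my_fcn_reference(const std::vector<double>& A,
                                     const std::vector<double>& B,
                                     const std::vector<double>& C,
                                     const std::vector<double>& D,
                                     std::vector<double>& X,
                                     const std::vector<double>& U,
                                     std::size_t m) {
    const std::size_t n = m;  // n = size(U, 2)

    // State update: X(:,i+1) = A*X(:,i) + B*U(:,i).
    // Column i+1 depends on column i, so this loop is sequential in i.
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t r = 0; r < m; ++r) {
            double acc = 0.0;
            for (std::size_t k = 0; k < m; ++k) {
                acc += A[idx(r, k, m)] * X[idx(k, i, m)]
                     + B[idx(r, k, m)] * U[idx(k, i, m)];
            }
            X[idx(r, i + 1, m)] = acc;
        }
    }

    // Output: out = C*X(:,1:n) + D*U.
    std::vector<double> out(m * n, 0.0);
    for (std::size_t c = 0; c < n; ++c) {
        for (std::size_t r = 0; r < m; ++r) {
            double acc = 0.0;
            for (std::size_t k = 0; k < m; ++k) {
                acc += C[idx(r, k, m)] * X[idx(k, c, m)]
                     + D[idx(r, k, m)] * U[idx(k, c, m)];
            }
            out[idx(r, c, m)] = acc;
        }
    }
    return out;
}
```

A reference like this can be used to check the output of the generated library entry point for the same inputs, independently of how the work is split into kernels.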

I generate the code with the following script:

A = double(zeros(3));
B = double(zeros(3));
C = double(zeros(3));
D = double(zeros(3));
X = double(zeros(3,4));
U = double(zeros(3));
cfg = coder.gpuConfig('lib');
codegen -args {A,B,C,D,X,U} -config cfg My_Fcn -report

When I generate code with 3-by-3 matrices, I get the correct kernels, as follows:

#include "My_Fcn.h"
#include "MWCudaDimUtility.hpp"

// Function Declarations
static __global__ void My_Fcn_kernel1(const double U[9], const double B[9],
  const double X[12], const int i, const double A[9], double b_A[3],
  double b_B[3]);
static __global__ void My_Fcn_kernel2(const double B[3], const double A[3],
  const int i, double X[12]);
static __global__ void My_Fcn_kernel3(const double U[9], const double D[9],
  const double X[12], const double C[9], double b_C[9], double b_D[9]);
static __global__ void My_Fcn_kernel4(const double D[9], const double C[9],
  double out[9]);

But when I increase the size of the matrices, for example to 100-by-100, the generated kernels are as follows:

// Include Files
#include "My_Fcn.h"
#include "MWCudaDimUtility.hpp"
#include "My_Fcn_data.h"
#include "My_Fcn_initialize.h"

// Type Definitions
#include "cublas_v2.h"

// Function Declarations
static __global__ void My_Fcn_kernel1(double DU[10000]);
static __global__ void My_Fcn_kernel2(double C[100]);
static __global__ void My_Fcn_kernel3(double C[100]);
static __global__ void My_Fcn_kernel4(const int i, double C[100], double b_C[100],
  double X[10100]);
static __global__ void My_Fcn_kernel5(double out[10000]);
static __global__ void My_Fcn_kernel6(double DU[10000], double out[10000]);
static cublasHandle_t getCublasGlobalHandle();
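Unlike the 3-by-3 listing, this one includes `cublas_v2.h` and declares `getCublasGlobalHandle`, which suggests that at this size the matrix products are handed to cuBLAS instead of being emitted as custom kernels. That is an inference from the listing, not something the generated code shown here confirms. Under that assumption, one step of the recurrence maps onto two GEMV-style operations of the form y = alpha*A*x + beta*y. The plain C++ sketch below is a CPU reference for that shape only; `gemv_ref` and `step_ref` are hypothetical names, and which cuBLAS routines the generated code actually calls is not known from this post.

```cpp
#include <cstddef>
#include <vector>

// CPU reference for a column-major, non-transposed GEMV:
//   y = alpha * A * x + beta * y,  with A of size m-by-n.
// When beta is zero, y is treated as write-only (its old contents are ignored).
void gemv_ref(std::size_t m, std::size_t n, double alpha, const double* A,
              const double* x, double beta, double* y) {
    for (std::size_t r = 0; r < m; ++r) {
        double acc = 0.0;
        for (std::size_t c = 0; c < n; ++c) {
            acc += A[r + c * m] * x[c];
        }
        y[r] = alpha * acc + (beta == 0.0 ? 0.0 : beta * y[r]);
    }
}

// One step of the recurrence, X(:,i+1) = A*X(:,i) + B*U(:,i), with 0-based i.
// Assumption: A, B, U are m-by-m and X is m-by-(m+1), all column-major.
void step_ref(std::size_t m, std::size_t i, const std::vector<double>& A,
              const std::vector<double>& B, const std::vector<double>& U,
              std::vector<double>& X) {
    const double* x_i = X.data() + i * m;       // column i of X
    const double* u_i = U.data() + i * m;       // column i of U
    double* x_next = X.data() + (i + 1) * m;    // column i+1 of X

    gemv_ref(m, m, 1.0, A.data(), x_i, 0.0, x_next);  // x_next  = A * x_i
    gemv_ref(m, m, 1.0, B.data(), u_i, 1.0, x_next);  // x_next += B * u_i
}
```

If the products are indeed delegated this way, A and B would be consumed by library calls and would not need to appear as arguments of the remaining custom kernels.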

Here I cannot pass the A, B, C, D, X, and U matrices to the kernels. I want to use these kernels to run the PTX file on the GPU, but because they are not generated as I expected, I am not able to proceed.

Can you please help me with a solution or workaround?

Thank you.

Answers (0)

Categories: Code Generation > GPU Coder > Get Started with GPU Coder

Products: MATLAB

Release: R2020a
