CUDA code generation using the codegen command

Bhushan Ravindra Attarde on 16 May 2020
Hi, I am generating CUDA code for the following function:
function [out] = My_Fcn(A,B,C,D,X,U) %#codegen
n = size(U,2);
DU = D*U(:,1:n);
out = coder.nullcopy(zeros(size(U)));
coder.gpu.kernelfun();
for i=1:1:n
X(:,i+1) = A*X(:,i) + B*U(:,i);
end
out = C*X(:,1:n) + DU;
end
I generate the code as follows:
A = double(zeros(3));
B = double(zeros(3));
C = double(zeros(3));
D = double(zeros(3));
X = double(zeros(3,4));
U = double(zeros(3));
cfg = coder.gpuConfig('lib');
codegen -args {A,B,C,D,X,U} -config cfg My_Fcn -report
When I generate code with 3x3 matrices, I get the expected kernels:
#include "My_Fcn.h"
#include "MWCudaDimUtility.hpp"
// Function Declarations
static __global__ void My_Fcn_kernel1(const double U[9], const double B[9],
const double X[12], const int i, const double A[9], double b_A[3], double b_B
[3]);
static __global__ void My_Fcn_kernel2(const double B[3], const double A[3],
const int i, double X[12]);
static __global__ void My_Fcn_kernel3(const double U[9], const double D[9],
const double X[12], const double C[9], double b_C[9], double b_D[9]);
static __global__ void My_Fcn_kernel4(const double D[9], const double C[9],
double out[9]);
But when I increase the matrix size, for example to 100x100, the generated kernels are as follows:
// Include Files
#include "My_Fcn.h"
#include "MWCudaDimUtility.hpp"
#include "My_Fcn_data.h"
#include "My_Fcn_initialize.h"
// Type Definitions
#include "cublas_v2.h"
// Function Declarations
static __global__ void My_Fcn_kernel1(double DU[10000]);
static __global__ void My_Fcn_kernel2(double C[100]);
static __global__ void My_Fcn_kernel3(double C[100]);
static __global__ void My_Fcn_kernel4(const int i, double C[100], double b_C[100],
double X[10100]);
static __global__ void My_Fcn_kernel5(double out[10000]);
static __global__ void My_Fcn_kernel6(double DU[10000], double out[10000]);
static cublasHandle_t getCublasGlobalHandle();
Here I cannot pass the A, B, C, D, X, and U matrices to the kernels. I want to use these kernels to run the PTX file on the GPU, but because they are not generated as expected, I cannot proceed.
Can you please help me with a solution or workaround?
Thank You
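
The `cublas_v2.h` include and `getCublasGlobalHandle` in the 100x100 output suggest that the matrix products were replaced with cuBLAS library calls, which would leave only helper kernels that operate on temporaries. A minimal sketch of one thing to try, assuming the `EnableCUBLAS` property of the GPU code configuration is available in R2020a: turn the cuBLAS replacement off so that the products are generated as kernels. Whether the resulting kernels then receive A, B, C, D, X, and U directly is not confirmed here, and the 100x101 size of X is inferred from `X[10100]` in the declarations above.

```matlab
% Sketch: regenerate code for 100x100 inputs with cuBLAS replacement disabled.
N = 100;                 % matrix size from the question
A = zeros(N);
B = zeros(N);
C = zeros(N);
D = zeros(N);
X = zeros(N, N+1);       % assumption: one extra column, matching X[10100]
U = zeros(N);

cfg = coder.gpuConfig('lib');
cfg.GpuConfig.EnableCUBLAS = false;   % do not map matrix products to cuBLAS calls

codegen -args {A,B,C,D,X,U} -config cfg My_Fcn -report
```

The generated kernels are still declared `static` and launched from the generated host function, so the code generation report is the place to check which arrays each kernel actually receives.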

Answers (0)

Release

R2020a