Wrong result when calling CBLAS dgemv function in a mex file

조회 수: 3 (최근 30일)
Wei HU
Wei HU 2018년 3월 25일
편집: Jan 2018년 3월 26일
Code is as follows,
extern "C" {
#include "cblas.h"
#include "mex.h"
}
#include<iostream>
using namespace std;
void mexFunction(int nlhs,mxArray* plhs[], int nrhs, const mxArray* prhs[])
{
if(nrhs!=2) {
mexErrMsgTxt("Two input required.");
} else if(nlhs>1){
mexErrMsgTxt("Too many outputs!");
}
mwSignedIndex row0 = mxGetM(prhs[0]);
mwSignedIndex col0 = mxGetN(prhs[0]);
mwSignedIndex row1 = mxGetM(prhs[1]);
mwSignedIndex col1 = mxGetN(prhs[1]);
if(col1!=1){
mexErrMsgTxt("Column vector required!");}
if(col0!=row1){
mexErrMsgTxt("Dimension mismatch!");}
double* A = mxGetPr(prhs[0]);
double* b = mxGetPr(prhs[1]);
plhs[0] = mxCreateDoubleMatrix(row0,1,mxREAL);
double* y = mxGetPr(plhs[0]);
cout << row0 << "," << col0 <<endl;
ptrdiff_t b_stride = 1;
ptrdiff_t y_stride = 1;
cblas_dgemv(CblasColMajor, CblasNoTrans,
row0, col0,
1.0,
A, col0,
b, b_stride,
0.,
y, y_stride);
}
Then, if I call,
mv([1,2;3,4;5,6;7,8],[3;4])
it just give result,
ans =
0
0
0
0
However, it give correct result for 2 by 2 matrice,
mv([1,2;3,4;],[3;4])
ans =
11
25
Very confused, if I call cblas_dgemv in cpp and compile it with Visual Studio, everything just works well.
Any suggestion and help is appreciated!
I compile the Git version of OpenBLAS. Since it works well programming in C++, I believe there is nothing to do with the compiled library.
  댓글 수: 2
Wei HU
Wei HU 2018년 3월 25일
Similar thing also happen for using cblas_dgemm to do matrix product, e.g., 3x3 matrix multiply 3x3 matrix produce correct result while 2x3 with 3x2 will give wrong result.
Jan
Jan 2018년 3월 26일
편집: Jan 2018년 3월 26일
It would be easier if you explain, how you compile the code: I'm not using CBLAS, but the standard BLAS libraries shipped with Matlab usually. This does not allow the smart CblasColMajor method, such that I cannot reproduce your problem directly.
Maybe this is the bug:
cblas_dgemv(CblasColMajor, CblasNoTrans,
row0, col0,
1.0,
A, row0, % Not: col0
b, b_stride,
0.,
y, y_stride);

댓글을 달려면 로그인하십시오.

답변 (1개)

Jan
Jan 2018년 3월 26일
편집: Jan 2018년 3월 26일
This does not answer your question directly, but this is a C-Mex function to call DGEMM and DGEMV:
#include "mex.h"
// Management of trailing underscore in BLAS and LAPACK calls:
#if defined(__WINDOWS__) || defined(WIN32) || defined(_WIN32) || defined(_WIN64)
#define BLASCALL(f) f
#else
#define BLASCALL(f) f ## _
#endif
// Integers in BLAS calls:
// #define int_BLAS ptrdiff_t
#define int_BLAS mwSignedIndex
int BLASCALL(dgemm)(char *transa, char *transb, int_BLAS *m, int_BLAS *n,
int_BLAS *k, double *alpha, double *a, int_BLAS *lda,
double *b, int_BLAS *ldb, double *beta, double *c,
int_BLAS *ldc);
int BLASCALL(dgemv)(char *transa, int_BLAS *m, int_BLAS *n,
double *alpha, double *a, int_BLAS *lda,
double *b, int_BLAS *incb, double *beta, double *c,
int_BLAS *incc);
void mexFunction(int nlhs,mxArray* plhs[], int nrhs, const mxArray* prhs[])
{
int_BLAS mA, nA, mB, nB, i1 = 1;
double *A, *B, *C;
double d1 = 1.0, d0 = 0.0;
if (nrhs != 2) {
mexErrMsgTxt("Two inputs required.");
} else if (nlhs > 1) {
mexErrMsgTxt("Too many outputs!");
}
mA = (int_BLAS) mxGetM(prhs[0]);
nA = (int_BLAS) mxGetN(prhs[0]);
mB = (int_BLAS) mxGetM(prhs[1]);
nB = (int_BLAS) mxGetN(prhs[1]);
if (nA != mB) {
mexErrMsgTxt("Dimension mismatch!");
}
A = mxGetPr(prhs[0]);
B = mxGetPr(prhs[1]);
plhs[0] = mxCreateDoubleMatrix(mA, nB, mxREAL);
C = mxGetPr(plhs[0]);
// C = A * B:
if (nB == 1) {
BLASCALL(dgemv) ("N", &mA, &nA,
&d1, A, &mA, B, &i1, &d0, C, &i1);
} else {
BLASCALL(dgemm) ("N", "N", &mA, &nB,
&nA, &d1, A, &mA, B, &mB,
&d0, C, &mA);
}
return;
}
Compile this by:
mex -O DGEMM.c libmwblas.lib -largeArrayDims
For matix input, this has the same speed as calling the matrix multiplication in Matlab directly, while for matrix-vector multiplication Matlab is faster:
A = rand(200, 300);
B = rand(300, 200);
for k = 1:100; C = A*B; end; % Warmup!
tic; for k = 1:1000; C = A*B; end; toc
tic; for k = 1:1000; C = DGEMM(A, B); end; toc
A = rand(200, 300);
b = rand(300, 1);
tic; for k = 1:10000; C = A*b; end; toc
tic; for k = 1:10000; C = DGEMM(A, b); end; toc
Matlab R2016b, Core2Duo, Win7:
Elapsed time is 3.166592 seconds.
Elapsed time is 3.148328 seconds.
Elapsed time is 0.263720 seconds.
Elapsed time is 0.304909 seconds.
By the way:
  • A*B and DGEMM(A,B) use multiple cores.
  • Defining alpha for alpha*A * B let the C-Mex function be 2% faster than Matlab.
  • It took me 20 minutes to find out, that the default -compatibleArrayDims let the BLAS library crash, because it worked sometimes by accident. My conclusion: Linear Algebra functions are implemented efficiently in Matlab and I do not spend time in performing them in C.

카테고리

Help CenterFile Exchange에서 Sparse Matrices에 대해 자세히 알아보기

태그

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by