Vec-trick implementation (multiple times)

Question


Dear all,

This question is related to my earlier question "Tensorproduct". Since that question was not answered as intended, I want to revisit it.

Introduction:

Suppose you have a matrix-vector multiplication in which the matrix C of size (np x mq) is the Kronecker product of a matrix A of size (n x m) and a matrix B of size (p x q). The vector is denoted v with size (mq x 1), or X with size (m x q) when it is reshaped into a matrix.

In two dimensions this operation can be performed with O(npq+qnm) operations instead of O(mqnp) operations, see Wikipedia.
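For reference, the identity behind this operation count can be written as follows. The ordering C = B ⊗ A is an assumption taken from the code later in this thread; the introduction itself does not fix the order of the Kronecker product.

```latex
(B \otimes A)\,\operatorname{vec}(X) = \operatorname{vec}\!\left(A X B^{\mathsf T}\right),
\qquad A \in \mathbb{R}^{n \times m},\quad B \in \mathbb{R}^{p \times q},\quad X \in \mathbb{R}^{m \times q}

\underbrace{n\,m\,q}_{A X} \;+\; \underbrace{n\,q\,p}_{(A X) B^{\mathsf T}}
\quad\text{multiplications versus}\quad
\underbrace{(n p)(m q)}_{C v}
```

For n=m=p=q=7 this gives 343 + 343 = 686 multiplications per vector instead of 2401.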

Expensive variant (in terms of flops):

Cheap variant (in terms of flops):
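A minimal sketch of the two variants for a single vector, assuming the ordering C = kron(B, A) that the answer in this thread uses, so that C*v equals vec(A*X*B.') with X = reshape(v, m, q). The sizes and variable names are illustrative assumptions, not taken from the original post.

```matlab
% Illustrative sizes (assumptions); A is n-by-m, B is p-by-q.
n = 3; m = 4; p = 5; q = 6;
A = rand(n, m);
B = rand(p, q);
v = rand(m*q, 1);            % vector of length m*q
X = reshape(v, m, q);        % the same data viewed as an m-by-q matrix

% Expensive variant: build the (n*p)-by-(m*q) Kronecker matrix explicitly.
C = kron(B, A);
w_expensive = C * v;

% Cheap variant: two small matrix products, no Kronecker matrix is formed.
w_cheap = reshape(A * X * B.', [], 1);

% Both results should agree up to rounding error.
err = max(abs(w_expensive - w_cheap));
```

Non-square sizes are used on purpose so that a wrong reshape or a swapped Kronecker order fails with a dimension error instead of going unnoticed.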

Main question:

I want to perform many of these operations at once, e.g. 2500000. Example: n=m=p=q=7 with A of size (7x7), B of size (7x7) and v of size (49x2500000).

In Tensorproduct I implemented a C MEX version of the cheap variant, which is considerably slower than a MATLAB version of the expensive variant provided by Bruno Luong.

Is it possible to implement the cheap version in Matlab without looping?

Comments: 5

Bruno Luong on 23 Aug 2021

Because fewer flops doesn't necessarily mean faster. Memory access, cache and thread management are important as well, and which method is fastest probably depends on n=m=p=q.

ConvexHull on 23 Aug 2021

Edited: ConvexHull on 23 Aug 2021

Yeah, that's definitely the case here.

The main problem is that if you want to perform the Vec-trick many times in a vectorized fashion, you have to reorder the data structure. After applying A*X to all vectors at once, the result is not laid out in memory so that you can directly perform a matrix-matrix multiplication with B.

Stupid Memory access O.o!
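The reordering described in this comment can be made explicit with permute on a 3-D array. The following is only a sketch with illustrative sizes and variable names of my own, assuming the ordering C = kron(B, A). It is not claimed to be faster than a transpose-based version, since permute also has to move the data in memory.

```matlab
% Illustrative sizes (assumptions); N is the number of vectors.
n = 3; m = 4; p = 5; q = 6; N = 10;
A = rand(n, m);
B = rand(p, q);
V = rand(m*q, N);                            % each column is one vec(X_k)

% Step 1: A acts on the leading dimension of every X_k at once.
Y = A * reshape(V, m, []);                   % n-by-(q*N)

% Step 2: reorder so that the q dimension comes first.
Y = permute(reshape(Y, n, q, N), [2 1 3]);   % q-by-n-by-N

% Step 3: now B can be applied with one matrix-matrix product.
Z = B * reshape(Y, q, []);                   % p-by-(n*N)

% Step 4: reorder back to n-by-p-by-N and flatten each page.
Z = permute(reshape(Z, p, n, N), [2 1 3]);   % n-by-p-by-N
W = reshape(Z, n*p, N);

% Reference result with the explicit Kronecker matrix.
Wref = kron(B, A) * V;
err = max(abs(W(:) - Wref(:)));
```

The two permute calls play the same role as the two transposes in a purely 2-D formulation: the multiplications themselves are cheap, the data movement is what costs time.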


Answer 1

ConvexHull on 24 Aug 2021

Edited: ConvexHull on 25 Aug 2021

Here is a MATLAB version that uses only built-in functions and no loops over the vectors. However, it needs two transpose operations and is quite slow.

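n=7;m=7;p=7;q=7;
A = rand(n,m);
B = rand(p,q);
v = rand(m*p,500000,5); % 2500000 vectors of length 49 in total
nrep = 5;               % number of timing repetitions
C = kron(B,A);
% Expensive variant: one product with the 49x49 Kronecker matrix
tic
for i=1:nrep
    v1 = reshape(C*reshape(v,49,[]),size(v));
end
toc % Elapsed time is 0.456353 seconds
% Cheap variant: two 7x7 products, each followed by a transpose that reorders the data
tic
for i=1:nrep
    v2 = reshape(reshape(B*reshape((A*reshape(v,7,[])).',7,[]),7*2500000,[]).',7,[]);
end
toc % Elapsed time is 3.879752 seconds
max(abs(v1(:)-v2(:)))
% 1.4211e-14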

Comments: 22

ConvexHull on 24 Aug 2021

Edited: ConvexHull on 24 Aug 2021

I don't know what you mean.

The ().' is by far the most expensive operation, no matter what is being done in the background.
Reshape is free.
The small 7x7 matrix-matrix multiplication is cheaper than the big 49x49 one.
By the way, ()' and ().' are nearly equally expensive.

n=7;m=7;p=7;q=7;
A = rand(n,m);
B = rand(p,q);
v = rand(m*p,500000,5);
n = 5;
tic
for i=1:n
    vv = reshape(v,7,[]); %#ok<*NASGU>
end
toc % Elapsed time is 0.000186 seconds
tic
for i=1:n
    vvv = A*vv;
end
toc % Elapsed time is 0.350487 seconds
tic
for i=1:n
    vvvv = (vvv).';
end
toc % Elapsed time is 1.682334 seconds
tic
for i=1:n
    vvvvv = reshape(vvvv,7,[]);
end
toc % Elapsed time is 0.000181 seconds
tic
for i=1:n
    vvvvvvv = B*vvvvv;
end
toc % Elapsed time is 0.347840 seconds
tic
for i=1:n
    vvvvvvvv = reshape(vvvvvvv,7*2500000,[]);
end
toc % Elapsed time is 0.000174 seconds
tic
for i=1:n
    vvvvvvvvv = (vvvvvvvv).';
end
toc % Elapsed time is 1.470868 seconds
tic
for i=1:n
    vvvvvvvvvv = reshape(vvvvvvvvv,7,[]);
end
toc % Elapsed time is 0.000148 seconds

Bruno Luong on 26 Aug 2021

Add benchmark with mtimesx

Conclusion

For versions before R2020b, use the expensive method for s < 44 and mtimesx otherwise;
For version R2020b or later, use the expensive method for s < 27 and pagemtimes otherwise.
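This rule of thumb could be wired into a small dispatcher. The following is only a sketch: the thresholds 27 and 44 are the crossover points of this particular benchmark and will likely differ on other machines, the variable names are my own, and the mtimesx branch assumes that the third-party mtimesx package is installed. R2020b corresponds to MATLAB version 9.9.

```matlab
s = 30;                                   % illustrative size, n = m = p = q = s
A = rand(s);
B = rand(s);
v = rand(s*s, 1000);

% pagemtimes is available from R2020b (version 9.9) onwards.
hasPagemtimes = ~verLessThan('matlab', '9.9');
if hasPagemtimes
    threshold = 27;                       % crossover from the benchmark, R2020b or later
else
    threshold = 44;                       % crossover from the benchmark, older releases
end

if s < threshold
    % Expensive method: explicit Kronecker matrix.
    w = kron(B, A) * v;
else
    % Cheap method: page-wise A*X*B.' on an m-by-q-by-N array.
    X = reshape(v, size(A, 2), size(B, 2), []);
    if hasPagemtimes
        W = pagemtimes(pagemtimes(A, X), 'none', B, 'transpose');
    else
        W = mtimesx(mtimesx(A, X), 'N', B, 'T');   % third-party package
    end
    w = reshape(W, [], size(v, 2));       % back to (n*p)-by-N
end
```

In practice it is probably worth rerunning the benchmark on the target machine and replacing the two thresholds with the measured crossover points.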

stab = 5:5:100;
t1 = zeros(size(stab));
t2 = zeros(size(stab));
t3 = zeros(size(stab));
t4 = zeros(size(stab));
for i = 1:length(stab)
    fprintf('%d/%d\n', i, length(stab));
    s = stab(i);
    n=s;
    m=s;
    p=s;
    q=s;
    
    A = rand(n,m);
    B = rand(p,q);
    v = rand(m*p,100000);
    
    tic
    C = kron(B,A);
    v1 = reshape(C*reshape(v,s*s,[]),size(v));
    t1(i) = toc;
    
    tic
    v2 = reshape(reshape(B*reshape((A*reshape(v,s,[])).',s,[]),[],s).',s,[]);
    t2(i) = toc;
    
    tic
    X = reshape(v, size(A,2), size(B,1), []);
    v3 = pagemtimes(pagemtimes(A, X), 'none', B, 'transpose');
    t3(i) = toc;
    
    tic
    X = reshape(v, size(A,2), size(B,1), []);
    v4 = mtimesx(mtimesx(A, X), 'N', B, 'T');
    t4(i) = toc;
end
close all
semilogy(stab, [t1; t2; t3; t4]');
legend('Expensive method', ...
    'Cheap method using transposition', ...
    'Cheap method using pagemtimes', ...
    'Cheap method using mtimesx');
xlabel('s');
ylabel('time [sec]');
grid on;

Stefano Cipolla on 14 Sep 2023

Edited: Stefano Cipolla on 14 Sep 2023

Hi there! May I ask if you are aware of an implementation of a function similar to "pagemtimes" that can work with at least one sparse input? Alternatively, do you see an easy workaround? More precisely, I need something like

pagemtimes(A, V)

where A is an n x n x n sparse real tensor and V is a real dense n x n matrix...

Bruno Luong on 14 Sep 2023

Edited: Bruno Luong on 14 Sep 2023

@Stefano Cipolla "sparse real tensor"

I'm not aware of such a native MATLAB class.

But you can put the pages of A as the diagonal blocks of an n^2 x n^2 sparse matrix

SA = [A(:,:,1)  0         0 ... 0
      0         A(:,:,2)  0 ... 0
      ...
      0         0 ...           A(:,:,n)]
    

Do the same expansion for V (with the same block) then solve it
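One possible reading of this suggestion, as a sketch only: since MATLAB has no sparse N-D arrays, the pages of A are assumed here to be stored as sparse matrices in a cell array (Acell is a hypothetical name), and the "expansion" of V is assumed to mean stacking n copies of V with repmat. Each diagonal block then multiplies its own copy of V.

```matlab
n = 4;                                  % illustrative size
Acell = cell(1, n);                     % hypothetical storage: one sparse page per cell
for k = 1:n
    Acell{k} = sprand(n, n, 0.3);       % random sparse n-by-n page
end
V = rand(n);                            % dense n-by-n matrix

SA = blkdiag(Acell{:});                 % sparse n^2-by-n^2 block-diagonal matrix
SV = repmat(V, n, 1);                   % n^2-by-n, V stacked n times

P = SA * SV;                            % rows (k-1)*n+1 : k*n hold Acell{k}*V

% Rearrange into an n-by-n-by-n dense array with page k equal to Acell{k}*V.
R = permute(reshape(full(P), n, n, n), [1 3 2]);

% Check every page against the direct product.
err = 0;
for k = 1:n
    err = max(err, max(max(abs(R(:, :, k) - Acell{k} * V))));
end
```

The result R matches what pagemtimes(A, V) would return for a dense A with the same pages.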
