
Optimize a vectorized code

Christopher on 10 Nov 2014
Edited: Christopher on 10 Nov 2014
I have some vectorized code. It runs fast (I think), but I would benefit enormously from reducing its computational cost. I have attached a picture of the MATLAB Profiler results, and the text of the code is appended:
This is a small part of a much larger program, but it is a huge bottleneck.
Some matrix sizes: brdzeros: 4x4; CLa_wbrd: 128x128; CLa: 120x120; Qx_La, CPLa_t, etc.: 120x120.
As you can see, the steps of the code are:
1. [ln. 978-986] Pad a matrix with periodic borders.
2. [ln. 990-1006] Create matrices that are offset by 1 or -1 columns, or by 1 or -1 rows.
3. [ln. 1009-1025] Divide those matrices element-wise by other preallocated matrices.
4. [ln. 1029-1042] Solve a finite-difference equation using the above preallocated matrices.
5. [ln. 1044-1045] Compute ratios of some of the results.
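Steps 1 and 2 above could possibly be merged: because the offsets are only ever one row or one column, the periodic neighbours can be read straight from the unpadded matrix with wrapped index vectors, so no 128x128 copy is built. The following is only a minimal sketch under that reading of the code. The names `up`, `down`, `left`, `right` and `C` are hypothetical stand-ins (not from the original code), and the sizes are kept small for illustration.

```matlab
% Sketch: periodic one-cell neighbours without a padded copy.
% C stands in for CLa / CSm / CYb; sizes are illustrative (the post uses 120x120).
n = 6; m = 5;
brd = 4;                      % border width implied by the 4x4 brdzeros
C = rand(n, m);

% Wrapped index vectors: build these once, before the time loop.
up    = [n, 1:n-1];           % row above each row (row 1 wraps to row n)
down  = [2:n, 1];             % row below each row (row n wraps to row 1)
left  = [m, 1:m-1];           % column to the left (column 1 wraps to column m)
right = [2:m, 1];             % column to the right (column m wraps to column 1)

% Inside the loop, each offset matrix becomes a single indexing operation.
C_t = C(up, :);               % plays the role of CLa_t
C_b = C(down, :);             % plays the role of CLa_b
C_l = C(:, left);             % plays the role of CLa_l
C_r = C(:, right);            % plays the role of CLa_r
C_i = C;                      % CLa_i is just the matrix itself
```

The same index vectors can be reused for all three species. Whether this is actually faster than the padded version is best confirmed with the Profiler on the real matrices.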
Are there faster ways of doing the same computations? Could I use GPU acceleration?
Any ideas or strategies are much appreciated.
Code:
for d=1:dstep
    % new concentrations mapped to wbrd grid
    CLa_wbrd = [brdzeros CLa(end-brd+1:end,:) brdzeros;
                CLa(:,end-brd+1:end) CLa CLa(:,1:brd);
                brdzeros CLa(1:brd,:) brdzeros];
    CSm_wbrd = [brdzeros CSm(end-brd+1:end,:) brdzeros;
                CSm(:,end-brd+1:end) CSm CSm(:,1:brd);
                brdzeros CSm(1:brd,:) brdzeros];
    CYb_wbrd = [brdzeros CYb(end-brd+1:end,:) brdzeros;
                CYb(:,end-brd+1:end) CYb CYb(:,1:brd);
                brdzeros CYb(1:brd,:) brdzeros];
    % create offset matrices of size(O): concentrations
    CLa_t = CLa_wbrd(5-1:end-4-1,5:end-4);
    CLa_r = CLa_wbrd(5:end-4,5+1:end-4+1);
    CLa_b = CLa_wbrd(5+1:end-4+1,5:end-4);
    CLa_l = CLa_wbrd(5:end-4,5-1:end-4-1);
    CLa_i = CLa_wbrd(5:end-4,5:end-4);
    CSm_t = CSm_wbrd(5-1:end-4-1,5:end-4);
    CSm_r = CSm_wbrd(5:end-4,5+1:end-4+1);
    CSm_b = CSm_wbrd(5+1:end-4+1,5:end-4);
    CSm_l = CSm_wbrd(5:end-4,5-1:end-4-1);
    CSm_i = CSm_wbrd(5:end-4,5:end-4);
    CYb_t = CYb_wbrd(5-1:end-4-1,5:end-4);
    CYb_r = CYb_wbrd(5:end-4,5+1:end-4+1);
    CYb_b = CYb_wbrd(5+1:end-4+1,5:end-4);
    CYb_l = CYb_wbrd(5:end-4,5-1:end-4-1);
    CYb_i = CYb_wbrd(5:end-4,5:end-4);
    % preallocated C/P's
    CPLa_t = CLa_t./PmatLa_t;
    CPLa_r = CLa_r./PmatLa_r;
    CPLa_b = CLa_b./PmatLa_b;
    CPLa_l = CLa_l./PmatLa_l;
    CPLa_i = CLa_i./PmatLa_i;
    CPSm_t = CSm_t./PmatSm_t;
    CPSm_r = CSm_r./PmatSm_r;
    CPSm_b = CSm_b./PmatSm_b;
    CPSm_l = CSm_l./PmatSm_l;
    CPSm_i = CSm_i./PmatSm_i;
    CPYb_t = CYb_t./PmatYb_t;
    CPYb_r = CYb_r./PmatYb_r;
    CPYb_b = CYb_b./PmatYb_b;
    CPYb_l = CYb_l./PmatYb_l;
    CPYb_i = CYb_i./PmatYb_i;
    % FLUXES, dC's, and new C's
    Qx_La = ((DmatLa_l+DmatLa_i).*(CPLa_l-CPLa_i)+(DmatLa_r+DmatLa_i).*(CPLa_r-CPLa_i));
    Qy_La = ((DmatLa_t+DmatLa_i).*(CPLa_t-CPLa_i)+(DmatLa_b+DmatLa_i).*(CPLa_b-CPLa_i));
    dCLa = (Qx_La/twoxstp2+Qy_La/twoystp2)*dtimestep; % CHANGE IN HEAT CONTENT FROM HEAT FLUX
    CLa = CLa+dCLa;
    Qx_Sm = ((DmatSm_l+DmatSm_i).*(CPSm_l-CPSm_i)+(DmatSm_r+DmatSm_i).*(CPSm_r-CPSm_i));
    Qy_Sm = ((DmatSm_t+DmatSm_i).*(CPSm_t-CPSm_i)+(DmatSm_b+DmatSm_i).*(CPSm_b-CPSm_i));
    dCSm = (Qx_Sm/twoxstp2+Qy_Sm/twoystp2)*dtimestep; % CHANGE IN HEAT CONTENT FROM HEAT FLUX
    CSm = CSm+dCSm;
    Qx_Yb = ((DmatYb_l+DmatYb_i).*(CPYb_l-CPYb_i)+(DmatYb_r+DmatYb_i).*(CPYb_r-CPYb_i));
    Qy_Yb = ((DmatYb_t+DmatYb_i).*(CPYb_t-CPYb_i)+(DmatYb_b+DmatYb_i).*(CPYb_b-CPYb_i));
    dCYb = (Qx_Yb/twoxstp2+Qy_Yb/twoystp2)*dtimestep; % CHANGE IN HEAT CONTENT FROM HEAT FLUX
    CYb = CYb+dCYb;
    CLaSm = CLa./CSm;
    CSmYb = CSm./CYb;
end
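One further possibility, sketched below for a single species, is to hoist everything that does not change between iterations out of the loop. This assumes that the `Dmat*` and `Pmat*` matrices and the scalars `twoxstp2`, `twoystp2` and `dtimestep` stay constant inside the loop, which is how they appear in the code above but is not stated explicitly. The names `W_*`, `iP_*`, `D_*`, `P_*` and the wrapped index vectors are hypothetical, and all numeric values are made up for illustration. Results can differ from the original at the level of floating-point rounding, because the operations are reordered.

```matlab
% Sketch: precompute loop-invariant terms once (one species shown).
% Assumption: the D_* / P_* matrices and the step scalars are constant in the loop.
rng(0);
n = 8; dstep = 5;
twoxstp2 = 2; twoystp2 = 2; dtimestep = 1e-3;     % illustrative values only
C0  = rand(n);                                    % stand-in for CLa
P_t = 1 + rand(n); P_r = 1 + rand(n); P_b = 1 + rand(n);
P_l = 1 + rand(n); P_i = 1 + rand(n);             % stand-ins for PmatLa_*
D_t = rand(n); D_r = rand(n); D_b = rand(n);
D_l = rand(n); D_i = rand(n);                     % stand-ins for DmatLa_*

% Wrapped index vectors for the periodic one-cell neighbours.
up = [n, 1:n-1]; down = [2:n, 1]; left = [n, 1:n-1]; right = [2:n, 1];

% Done once: fold the scalar factors into the summed diffusivities ...
W_l = (D_l + D_i) * (dtimestep / twoxstp2);
W_r = (D_r + D_i) * (dtimestep / twoxstp2);
W_t = (D_t + D_i) * (dtimestep / twoystp2);
W_b = (D_b + D_i) * (dtimestep / twoystp2);
% ... and replace the per-step divisions by multiplications with reciprocals.
iP_t = 1 ./ P_t; iP_r = 1 ./ P_r; iP_b = 1 ./ P_b;
iP_l = 1 ./ P_l; iP_i = 1 ./ P_i;

C = C0;
for d = 1:dstep
    CP_i = C .* iP_i;                             % centre value of C/P
    C = C + W_l .* (C(:, left)  .* iP_l - CP_i) ...
          + W_r .* (C(:, right) .* iP_r - CP_i) ...
          + W_t .* (C(up, :)    .* iP_t - CP_i) ...
          + W_b .* (C(down, :)  .* iP_b - CP_i);
end
C_fast = C;
```

If `CLaSm` and `CSmYb` are only needed after the last step, which cannot be determined from the excerpt, the two ratio lines could also move below the `end`.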

Answers (0)
