How to preallocate memory for storing data in same mat file?

Sunny on 20 Oct 2018
Commented: Guillaume on 26 Oct 2018
Hi, I wrote the code below and would like to preallocate memory so that it runs faster. I know that once I preallocate, I can no longer append and must index into the array to store the output. Can you suggest how to do that for the code below?
Here f is a 1*5449 double and the final output is a 5449*5449 double.
clc;
n=1; %system order
m=1; %number of inputs
p=6; %number of outputs
Final = [];
for i = 1:7783
    for j = 1:50
        if exist(['ID_',num2str(i),'_file_',num2str(j),'_Variables','.mat'],'file')
            load(['ID_',num2str(i),'_file_',num2str(j),'_Variables','.mat']);
            A1 = A{1};
            A1 = A1 / max(abs(eig(A1)));
            B1 = B{1};
            C1 = C{1};
            index = 1;
            for k = 1:7783
                for l = 1:50
                    if exist(['ID_',num2str(k),'_file_',num2str(l),'_Variables','.mat'],'file')
                        load(['ID_',num2str(k),'_file_',num2str(l),'_Variables','.mat']);
                        A2 = A{1};
                        A2 = A2 / max(abs(eig(A2)));
                        B2 = B{1};
                        C2 = C{1};
                        f(index) = distance1_matlab(A1,A2,B1,B2,C1,C2);
                        index = index + 1;
                    end
                end
            end
            Final = [Final;f];
        end
    end
end
save('Distance','Final');
  5 Comments
Sunny on 21 Oct 2018
Edited: Sunny on 21 Oct 2018
Thanks. I changed the program to this, which I think is faster. A is a 10*10 double, B is 1*10 and C is 6*10. The cell arrays f, o and g are now 1*5449.
clc;
n=10; %system order
m=1; %number of inputs
p=6; %number of outputs
Final = [];
k = 1;
for i = 1:7783
    for j = 1:50
        if exist(['ID_',num2str(i),'_file_',num2str(j),'_Variables','.mat'],'file')
            load(['ID_',num2str(i),'_file_',num2str(j),'_Variables','.mat']);
            f{k} = A{1};
            o{k} = B{1};
            g{k} = C{1};
            k = k+1;
        end
    end
end
save('Rescaled_A_Values_All_States','f');
save('Rescaled_B_Values_All_States','o');
save('Rescaled_C_Values_All_States','g');
for c = 1:5449
    A1 = f{c};
    A1 = A1 / max(abs(eig(A1)));
    B1 = o{c};
    C1 = g{c};
    index = 1;
    for d = 1:5449
        A2 = f{d};
        A2 = A2 / max(abs(eig(A2)));
        B2 = o{d};
        C2 = g{d};
        q(index) = distance1_matlab(A1,A2,B1,B2,C1,C2);
        index = index + 1;
    end
    Final = [Final;q];
end
Guillaume on 21 Oct 2018
Well, yes, it's going to be much faster: you're now reading each file only once. You're still doing about N^2 unnecessary eig calls and related calculations, though, since every A is renormalised inside the inner loop. And nearly 99% of the files you test for existence don't exist, so it would be faster to call dir and let the OS tell you which files are there.
Finally, depending on what distance1_matlab does, it may well be that your 2nd loop is not needed.
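A minimal sketch of the first two points, normalising each A once and filling a preallocated matrix by index instead of growing Final. The random data and the anonymous function distfun are stand-ins (assumptions) for the loaded files and for distance1_matlab, whose internals are not shown in this thread.

```matlab
% Stand-in data: N systems with 10x10 A, 1x10 B and 6x10 C, as described above.
rng(0);
N = 5;                                  % 5449 in the real data set
f = arrayfun(@(k) rand(10), 1:N, 'UniformOutput', false);
o = arrayfun(@(k) rand(1, 10), 1:N, 'UniformOutput', false);
g = arrayfun(@(k) rand(6, 10), 1:N, 'UniformOutput', false);

% Hypothetical placeholder for distance1_matlab (assumed to return a scalar).
distfun = @(A1, A2, B1, B2, C1, C2) norm(A1 - A2, 'fro');

% Normalise every A exactly once: N calls to eig instead of N^2 + N.
fn = cell(1, N);
for c = 1:N
    fn{c} = f{c} / max(abs(eig(f{c})));
end

% Preallocate the full result, then store each value by index.
Final = zeros(N, N);
for c = 1:N
    for d = 1:N
        Final(c, d) = distfun(fn{c}, fn{d}, o{c}, o{d}, g{c}, g{d});
    end
end
```

With the real data, distfun would be replaced by distance1_matlab and the stand-in cell arrays by the f, o and g loaded from the files.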


Accepted Answer

Guillaume on 21 Oct 2018
Depending on what distance1_matlab does, this code could be significantly improved.
I'm also assuming that all files that match the pattern 'ID_*_file_*_Variables.mat' need to be loaded.
filelist = dir('ID_*_file_*_Variables.mat'); %get list of files that exist
fileids = regexp({filelist.name}, 'ID_(\d+)_file_(\d+)_', 'tokens', 'once'); %extract numeric ids as text
fileids = str2double(vertcat(fileids{:})); %and convert to numeric
%you may want to sort fileids and filelist to match the order of your original loops
%it's trivial to do. For now I assume it does not matter.
filedata = struct('A', cell(numel(filelist), 1), 'B', [], 'C', []); %preallocate structure to receive file content and final result
%note that A, B and C are very poor field names.
for fileiter = 1:numel(filelist)
    filecontent = load(filelist(fileiter).name); %load into a struct instead of the workspace
    filedata(fileiter).A = filecontent.A{1} / max(abs(eig(filecontent.A{1}))); %normalise A once per file
    filedata(fileiter).B = filecontent.B{1};
    filedata(fileiter).C = filecontent.C{1};
end
[idx1, idx2] = ndgrid(1:numel(filedata)); %cartesian product of all file indices with themselves (the indices are gridded rather than the struct array, since ndgrid expects numeric input)
distance = arrayfun(@(i1, i2) distance1_matlab(filedata(i1).A, filedata(i2).A, filedata(i1).B, filedata(i2).B, filedata(i1).C, filedata(i2).C), idx1, idx2); %assumes that the result of distance1_matlab is scalar
Note that the last line assumes distance1_matlab returns a scalar. If not, change it to:
distance = arrayfun(@(i1, i2) distance1_matlab(filedata(i1).A, filedata(i2).A, filedata(i1).B, filedata(i2).B, filedata(i1).C, filedata(i2).C), idx1, idx2, 'UniformOutput', false);
If you want the result in the same form as your original Final, then:
distance = distance(:); %if scalar result out of
distance = vertcat(distance{:}); %otherwise
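The sorting step mentioned in the code comments could look like the sketch below. It matters because a name-based listing usually puts ID_12 before ID_2, which differs from the numeric order of the original loops. The three file names are made up for illustration and stand in for the output of dir.

```matlab
% Stand-in for the output of dir: a struct array with a name field.
filelist = struct('name', {'ID_12_file_3_Variables.mat', ...
                           'ID_2_file_10_Variables.mat', ...
                           'ID_2_file_1_Variables.mat'});
fileids = regexp({filelist.name}, 'ID_(\d+)_file_(\d+)_', 'tokens', 'once');
fileids = str2double(vertcat(fileids{:}));   % N-by-2: [ID, file number]

% Sort by ID first, then by file number, and reorder the file list to match.
[fileids, order] = sortrows(fileids);
filelist = filelist(order);
```

After this, row k of fileids and filelist(k) refer to the same file, in the same order the nested i/j loops would have visited them.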
  2 Comments
Sunny on 26 Oct 2018
@Guillaume
Can I use parfor instead of for to speed up execution with parallel processing? Do the loops synchronize?
Guillaume on 26 Oct 2018
I doubt that using parfor for the loading loop would help much. The slow part there is not the processor but the disk access. If anything, parfor may slow things down as parallel workers compete for disk access. You'll only know if you try.
I don't know if the parallel toolbox can parallelise arrayfun (I don't have the toolbox). arrayfun is a for loop in disguise, so parallelising that part of the code could certainly result in a speed-up.
However, as I've said (twice now), depending on what distance1_matlab does, it's likely that this 2nd loop/arrayfun is not needed at all and that the function can be vectorised. That would probably be the most efficient way to improve your code, which is why I asked for the details of this function.
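For reference, a hedged sketch of what a parfor version of the distance loop might look like. It requires the Parallel Computing Toolbox to actually run in parallel, and it has not been timed against the real data. The random filedata and the anonymous function distfun are assumptions standing in for the loaded files and for distance1_matlab. parfor iterations run independently and in no guaranteed order, so each one must write only to its own slice of the output.

```matlab
% Stand-in data and distance function (assumptions, the real ones are not shown).
rng(0);
N = 4;
filedata = struct('A', cell(N, 1), 'B', [], 'C', []);
for k = 1:N
    A = rand(10);
    filedata(k).A = A / max(abs(eig(A)));   % normalised once, outside the parfor
    filedata(k).B = rand(1, 10);
    filedata(k).C = rand(6, 10);
end
distfun = @(A1, A2, B1, B2, C1, C2) norm(A1 - A2, 'fro');  % hypothetical placeholder

distance = zeros(N, N);              % preallocated, sliced by row inside parfor
parfor i = 1:N
    row = zeros(1, N);               % temporary variable, local to each iteration
    for j = 1:N
        row(j) = distfun(filedata(i).A, filedata(j).A, ...
                         filedata(i).B, filedata(j).B, ...
                         filedata(i).C, filedata(j).C);
    end
    distance(i, :) = row;            % each iteration writes only its own row
end
```

Because filedata is read with both i and j, it is sent whole to every worker, so any gain depends on distance1_matlab being expensive relative to that transfer.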

Release

R2018b
