Confusion about performance improvement by memory preallocation
I have a complex task: extracting relevant measurement data from a stack of MAT-files (each contains 237 variables). There are 30 MAT-files, of which files 9 to 18 are relevant (I cannot delete the others; let's just accept this here). In each relevant file, only a few data points are to be extracted for each variable; their indices are given by a separate file containing the measurement times.
meas_times.mat looks like this: each column corresponds to a file (9:18), and the rows contain start and stop times in alternating order.
0 0 0 0 0 0 0 0 315 0
0 0 0 0 0 0 0 0 352 0
0 0 0 0 56 0 26 0 373 0
0 0 0 0 115 0 45 0 394 0
0 0 0 0 141 121 65 515 476 1
0 0 0 0 200 180 104 574 511 8
0 0 0 0 471 201 132 609 837 44
0 0 0 0 530 260 191 611 860 95
0 0 0 0 610 721 443 664 881 109
0 0 0 0 669 780 502 720 910 154
0 461 0 0 671 796 521 737 961 174
0 520 0 0 730 821 580 796 990 187
0 591 0 171 1216 1072 711 1126 1071 580
0 650 0 230 1275 1105 770 1185 1130 610
0 981 21 276 1311 1197 773 1241 1171 667
0 1040 80 335 1336 1256 832 1300 1200 695
508 1061 121 721 1629 1376 1390 1324 1311 783
567 1120 142 780 1662 1435 1449 1383 1340 842
601 1721 151 816 1677 1451 1482 1411 1721 1151
660 1780 188 875 1736 1510 1541 1470 1780 1210
Example:
From file 9 (1st column), I only want the data at indices 508-567 and 601-660 of each variable. From file 11 (3rd column), I want 21-80 and so on.
My first solution with consecutive for-loops was:
tic
%% initialize
clear
clc
load('meas_times.mat')
times = B;
%% define path
path = 'C:\Users\felix\Documents\HeatFlow\Messkampagne Itterbeck\Messdaten\';
list = dir(path);
%% create empty struct t to collect relevant data in
t = struct(load(sprintf('%s%s',path,list(9).name)));
fn = fieldnames(t);
for a = 1:numel(fn)
    t.(fn{a}) = [];
end
%% filter relevant data
for b = 9:18
    % process files 9 to 18, b is their position in the folder listing
    Dateiname = sprintf('%s%s',path,list(b).name);
    s = struct(load(Dateiname)); % this is the b-th file as struct
    fn = fieldnames(s);
    if fn{1} == "Abtastrate_1_HzZeit_1_Hz_" % removes a duplicate variable that some files have
        s = rmfield(s,'Abtastrate_1_HzZeit_1_Hz_');
        fn = fn(2:end);
    end
    v = nonzeros(times(:,b-8)); % extract the measurement times that correspond to the b-th file
    for z = 1:115 % 1-Hz variables from row 1 to row 115
        temp = []; % empty dummy array
        for i = 1:2:length(v)
            temp = vertcat(temp,s.(fn{z})(v(i):v(i+1)));
        end
        t.(fn{z}) = vertcat(t.(fn{z}),temp); % write extracted values into the new struct to the z-th variable
    end
    for z = 116:154 % 2-Hz variables up to row 154
        % preprocessing: mean of 2Hz
        for i = 1:2:numel(s.(fn{z}))-1
            s.(fn{z})(i:i+1) = mean(s.(fn{z})(i:i+1));
        end
        s.(fn{z})(2:2:end) = [];
        temp = []; % empty dummy array
        for i = 1:2:length(v)
            temp = vertcat(temp,s.(fn{z})(v(i):v(i+1)));
        end
        t.(fn{z}) = vertcat(t.(fn{z}),temp); % write extracted values into the new struct to the z-th variable
    end
end
toc
However, this ran very slowly, because the last variables in each file contain a few million values (damn 1 kHz loggers). MATLAB gave me the hint that the temp array was changing size on every loop iteration and that this would hurt performance. So I completely rewrote the script, introduced a function and preallocated memory. However, now the whole operation takes three times as long. Here is the "improved" script. Have I made any obvious mistake?
tic
%% initialize
clear
clc
load('meas_times.mat')
times = B;
%% define path
path = 'C:\Users\felix\Documents\HeatFlow\Messkampagne Itterbeck\Messdaten\';
list = dir(path);
%% create empty struct t to collect relevant data in
t = struct(load(sprintf('%s%s',path,list(9).name)));
fn = fieldnames(t);
for a = 1:numel(fn)
    t.(fn{a}) = [];
end
%% filter relevant data
for b = 9:18
    % process files 9 to 18, b is their position in the folder listing
    filename = sprintf('%s%s',path,list(b).name);
    s = struct(load(filename)); % this is the b-th file as struct
    fn = fieldnames(s);
    if fn{1} == "Abtastrate_1_HzZeit_1_Hz_" % removes a duplicate variable that some files have
        s = rmfield(s,'Abtastrate_1_HzZeit_1_Hz_');
        fn = fn(2:end);
    end
    v = nonzeros(times(:,b-8)); % extract the measurement times that correspond to the b-th file
    w = [[0,0]';diff(v)+1]; % important for indexing later on
    for z = 1:154 % 154 variables in the struct s
        switch z
            case num2cell(1:115)
                hz = 1;
                t.(fn{z}) = vertcat(t.(fn{z}),selectdata(hz,z,v,w,s,fn));
            case num2cell(116:154)
                hz = 2;
                t.(fn{z}) = vertcat(t.(fn{z}),selectdata(hz,z,v,w,s,fn));
        end
    end
end
toc
function temp = selectdata(hz,z,v,w,s,fn)
    % preprocessing
    for i = 1:numel(s.(fn{z}))/hz
        s.(fn{z})(i:i+(hz-1)) = mean(s.(fn{z})(i:i+(hz-1)));
        s.(fn{z})(i+1:i+(hz-1)) = [];
    end
    % empty dummy array
    temp = zeros(sum(w(1:2:end)),1);
    % extract data into temp
    for i = 1:2:length(v)
        temp(sum(w(i:-2:1))+1:sum(w(i+2:-2:1)),1) = s.(fn{z})(v(i):v(i+1));
    end
end
Feel free to ask questions. I do not want help with my project; I want to understand why the code got SLOWER when I followed MATLAB's advice and changed the code so that it preallocates memory.
The 1st code took 9.7 seconds, the 2nd code approximately 31 seconds.
Comments: 9
Bruno Luong
11 Dec 2020
Edited: Bruno Luong
11 Dec 2020
Yes, that's what I have in mind. There might be something about the allocation of u that you could improve, but it's not a big deal and it's a detail you can work on later.
What you could do after the loop over u to build a single combined struct goes something like this (look up MATLAB's comma-separated list syntax to understand the code):
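% test data, replace with your u array of structures
u(1) = struct('a', [0;1], 'b', [2;3]);
u(2) = struct('a', 4, 'b', [5;6;7]);
f = fieldnames(u);
% u.(name) expands to a comma-separated list, so vertcat joins that field across all elements of u
data = cellfun(@(name) vertcat(u.(name)), f, 'UniformOutput', false);
sarg = [f, data].'; % 2-by-N cell: field names in row 1, joined values in row 2
sall = struct(sarg{:}) % expands to struct('a', ..., 'b', ...)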
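The comma-separated list behaviour that this snippet relies on can be seen in isolation. The following is a minimal sketch; the names c, s, joined, colA, args and merged are made up for illustration and do not come from the original scripts.

```matlab
% A cell array expanded with {:} behaves like a comma-separated list of its contents
c = {[1;2], 3, [4;5;6]};
joined = vertcat(c{:});          % same as vertcat(c{1}, c{2}, c{3})

% A field access on a struct array also yields a comma-separated list
s(1).a = [0;1];
s(2).a = 4;
colA = vertcat(s.a);             % same as vertcat(s(1).a, s(2).a)

% Name/value pairs for struct() can be supplied the same way
args = {'a', colA, 'b', joined};
merged = struct(args{:});        % same as struct('a', colA, 'b', joined)
```

Because the values passed to struct() here are numeric arrays rather than cell arrays, the result is a scalar struct.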
Accepted Answer
Jan
11 Dec 2020
Try replacing this part of the first solution:
for z = 1:115 % 1-Hz variables from row 1 to row 115
    temp = []; % empty dummy array
    for i = 1:2:length(v)
        temp = vertcat(temp,s.(fn{z})(v(i):v(i+1)));
    end
    t.(fn{z}) = vertcat(t.(fn{z}),temp); % write extracted values into the new struct to the z-th variable
end
by:
for z = 1:115 % 1-Hz variables from row 1 to row 115
    lenv = numel(v);
    tmpC = cell(lenv / 2, 1); % one cell per start/stop pair (cell(n) alone would create an n-by-n cell)
    sfnz = s.(fn{z}); % cheap shared data copy instead of repeated indexing
    for k = 1:lenv / 2
        idx = 2 * k - 1;
        tmpC{k} = sfnz(v(idx):v(idx + 1));
    end
    t.(fn{z}) = vertcat(t.(fn{z}), tmpC{:}); % write extracted values into the new struct to the z-th variable
end
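A related variant avoids the inner loop altogether by building one index vector from the start/stop pairs and indexing once. This is only a sketch under stated assumptions: sfnz and v below are small made-up stand-ins for one variable and its measurement times, and the data is assumed to be a column vector.

```matlab
% Hypothetical stand-ins for one variable (sfnz) and its start/stop indices (v)
sfnz = (1:20).' * 10;            % column vector of fake measurements
v    = [3; 5; 9; 12];            % alternating start/stop: 3-5 and 9-12

starts = v(1:2:end);
stops  = v(2:2:end);

% Build each start:stop range once and join them into one index vector
ranges = arrayfun(@(a, b) (a:b).', starts, stops, 'UniformOutput', false);
idx    = vertcat(ranges{:});

% One indexing operation replaces the inner loop
selected = sfnz(idx);
```

Since v depends only on the file and not on the variable, idx could presumably be computed once per file and reused for every variable recorded at the same rate.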
The iterative growing or shrinking of arrays is extremely expensive. So avoid things like this:
s.(fn{z})(i+1:i+(hz-1)) = [];
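For the 2 Hz to 1 Hz averaging, the shrinking loop can be avoided entirely with a reshape. The following is a minimal sketch with a made-up signal x. One assumption to note: it drops a trailing unpaired sample, whereas the loop in the first script would keep such a sample unchanged.

```matlab
% Hypothetical 2 Hz signal; the name x is made up for this sketch
x = [1; 3; 5; 7; 10; 20; 4];     % odd length on purpose

n     = 2 * floor(numel(x) / 2); % drop a trailing unpaired sample, if any
pairs = reshape(x(1:n), 2, []);  % each column holds one pair of consecutive samples
x1hz  = mean(pairs, 1).';        % average each pair, return a column vector
```

No element is ever deleted here, so the array is never reallocated inside a loop.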
A tiny example:
x = [];
for k = 1:1e6
x(k) = k;
end
This does not request 8 MB (8 bytes per double), but sum(1:1e6)*8 bytes in total, because in each iteration a new array is created and the old contents are copied into it. Added up over all iterations, that is more than 4 TB of memory traffic! Of course this is slow. With pre-allocation and without a growing array, MATLAB requests only the expected 8 MB: x = zeros(1, 1e6). The same effect applies to iterative shrinking.
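The difference can be measured directly. The following is a rough sketch; the variable names and the reduced size n are choices made for this demo, and the timings it prints will depend on the MATLAB release and the machine.

```matlab
n = 1e5;                         % smaller than 1e6 to keep the demo quick

% Growing array: x is enlarged on every iteration
tic
xGrow = [];
for k = 1:n
    xGrow(k) = k;
end
tGrow = toc;

% Pre-allocated array: memory is reserved once
tic
xPre = zeros(1, n);
for k = 1:n
    xPre(k) = k;
end
tPre = toc;

fprintf('growing: %.4f s, pre-allocated: %.4f s\n', tGrow, tPre);
```

Both loops produce the same vector; only the allocation pattern differs.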
In MATLAB R2018b, vertcat has some potential for improvement. So it might be a good idea to test the speed with https://www.mathworks.com/matlabcentral/fileexchange/28916-cell2vec .