Optimized code for loop, If-statement for large dataset

조회 수: 1 (최근 30일)
Jhon Gray
Jhon Gray 2019년 5월 25일
편집: per isakson 2019년 5월 26일
I was hoping to delete some certain rows using condion. My data is in double format (790127*24) I approximate the total code need 25 hours using Run and Time which is huge. Is there any way of optimiing the script.
TIA...
n=0;
for i = 1 : length(d_A)
if any(isnan(d_A(i-n, 6))) ...
&& any(isnan(d_A(i-n, 7))) ...
&& any(isnan(d_A(i-n, 8))) ...
&& any(isnan(d_A(i-n, 9))) ...
&& any(isnan(d_A(i-n,10))) ...
&& any(isnan(d_A(i-n,11))) ...
&& any(isnan(d_A(i-n,12))) ...
&& any(isnan(d_A(i-n,13))) ...
&& any(isnan(d_A(i-n,14))) ...
&& any(isnan(d_A(i-n,15))) ...
&& any(isnan(d_A(i-n,16))) ...
&& any(isnan(d_A(i-n,17))) ...
&& any(isnan(d_A(i-n,18))) ...
&& any(isnan(d_A(i-n,19)))
d_A(i-n,:) = [];
n=n+1;
end
end

채택된 답변

per isakson
per isakson 2019년 5월 25일
편집: per isakson 2019년 5월 26일
Try this
%%
ixcol = [ 6, 7, 8, 9,10,11,12,13,14,15,16,17,18,19 ];
n=0;
for i = 1 : length(d_A)
if all( isnan( d_A( i-n, ixcol ))) % i-n is a scalar
d_A(i-n,:) = [];
n=n+1;
end
end
and this
%%
ixcol = [ 6, 7, 8, 9,10,11,12,13,14,15,16,17,18,19 ];
is_to_be_deleted = false( size(d_A,1), 1 );
n=0;
for i = 1 : length(d_A)
if all( isnan( d_A( i-n, ixcol ))) % i-n is a scalar
% d_A(i-n,:) = [];
is_to_be_deleted(i-n) = true;
n=n+1;
end
end
d_A( is_to_be_deleted, : ) = [];
Caveat: not tested
In response to comment:
Now it's possible to factor out the for-loop. Try this
%% Sample data
A = rand( [8,4] );
ixcol = [2,3];
A([3,5],ixcol) = nan;
A( randperm( numel(A), 9 ) ) = nan;
%%
[ A3, ix_deleted3 ] = cssm_3( A, ixcol );
[ A4, ix_deleted4 ] = cssm_4( A, ixcol );
ix_deleted3 == ix_deleted4 %#ok<NOPTS,EQEFF>
function [ A, ix_deleted ] = cssm_3( A, ixcol )
is_to_be_deleted = false( size(A,1), 1 );
for jj = 1 : length(A)
if all( isnan( A( jj, ixcol )))
is_to_be_deleted(jj) = true;
end
end
A( is_to_be_deleted, : ) = [];
ix_deleted = find( is_to_be_deleted );
end
function [ A, ix_deleted ] = cssm_4( A, ixcol )
is_to_be_deleted = all( isnan( A( :, ixcol ) ), 2 );
A( is_to_be_deleted, : ) = [];
ix_deleted = find( is_to_be_deleted );
end
it outputs
>> cssm
ans =
2×1 logical array
1
1
The vectorized version, cssm_4, might not improve performance significantly, but in my opinion it makes cleaner code.
  댓글 수: 2
Jhon Gray
Jhon Gray 2019년 5월 25일
편집: Jhon Gray 2019년 5월 25일
Wow. The second one is super fast.But there's a little bit problem here. The code would be like this.No need of i-n in this case.Thanks for helping.Take love.
ixcol = [ 6, 7, 8, 9,10,11,12,13,14,15,16,17,18,19 ];
is_to_be_deleted = false( size(d_A,1), 1 );
n=0;
for i = 1 : 1000 %length(d_A)
if all( isnan( d_A( i, ixcol ))) % i-n is a scalar
% d_A(i-n,:) = [];
is_to_be_deleted(i) = true;
%n=n+1;
end
end
d_A( is_to_be_deleted, : ) = [];
per isakson
per isakson 2019년 5월 26일
편집: per isakson 2019년 5월 26일
I surmised that there was a problem and added the last line in bold.
It's as a bad for performance to remove one line at a time as adding one line at a time. In both cases the matrix is rewritten to memory in each operation.

댓글을 달려면 로그인하십시오.

추가 답변 (0개)

카테고리

Help CenterFile Exchange에서 Matrices and Arrays에 대해 자세히 알아보기

제품


릴리스

R2018b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by