Remove NaN Entries in a Dataset
조회 수: 18 (최근 30일)
I have a data set with multiple variables (var1, var2, var3). Any of the variables can have a NaN values. If one value for one of the variables is a NaN, I'd like to remove that instance for each variable. I have the following code that does this, but is very slow:
count = 1;
for j = 1:length(var1)
if isnan(var1(j)) == 1 || ...
isnan(var2(j)) == 1 || ...
isnan(var3(j)) == 1
var1_cull(count,1) = var1(j);
var2_cull(count,1) = var2(j);
var3_cull(count,1) = var3(j);
count = count + 1;
How can I modify this routine to have the speed up the data removal?
I've tried using the find() function, but I haven't been able to get it to work when checking to see if each variable is a nan.
Vilém Frynta 2023년 1월 20일
편집: Vilém Frynta 님. 2023년 1월 20일
You do not need to go through every element of vector with for loop.
You can use isnan on the vector, which will give you logical vector (0s and 1s), which you can then use on all the other vectors. You can also combine logical vectors together.
I will try my best to demonstrate:
% Random vars for demonstration
var1 = [1, 2, 3, NaN, 5, 6];
var2 = [9, NaN, 7, 6, 5, 4];
% 1 = NaN values, we will invert this later (~)
idx1 = isnan(var1);
idx2 = isnan(var2)
% Combine logical vectors
idx = idx1 | idx2
% Apply logical vector to your variables