Deleting NaN from a large array
I have an extremely large array, and I am trying to delete every single NaN, as the trials do not all have the same number of variables. Is there any way to simply read the entirety of A and just delete the NaNs? I have been trying to use isnan, and I keep either deleting everything or getting only NaN back.
Accepted Answer
Depends on what "extremely large" means. Let's start by assuming it's not too large for this to work. Furthermore, it depends on what "delete" means.
In this first example, I assume a generic array and that "delete" means "replace with zero".
% build example array with NaNs at random positions
s = [10000 10000];
A = zeros(s);
idx = randi([1 prod(s)],100,1);
A(idx) = NaN;
A(isnan(A)) = 0; % replace NaN with zero
sum(isnan(A),'all') % show that all the NaNs are gone
ans = 0
In this second example, I assume a vector and I assume "delete" means "remove this element and collapse the vector"
% build example vector with NaNs at random positions
s = [1000000 1];
A = zeros(s);
idx = randi([1 prod(s)],100,1);
A(idx) = NaN;
A = A(~isnan(A)); % remove NaNs
sum(isnan(A),'all') % show that all the NaNs are gone
ans = 0
Note that element removal was only demonstrated for a vector. This is simply because you can't remove single elements from a 2 (or more) dimensional array and collapse the array accordingly. In other words, you can't have "holes" in an array. Only in the 1D case does removing a single element result in an unambiguous means to collapse the array.
If neither of these is close to what you need, or if your array is so large that these can't work, we'll work from there.
Comments: 5
Colton McGarraugh
6 Oct 2021
Thank you! Unfortunately, by "delete" I meant removing the NaN completely from the column! That way, when I go to run my for loop later, it stops and doesn't start reading the NaN in the columns. And by extremely large, I mean a 6400x120 double. I tried doing
A=A(:,~all(isnan(A))) to see if that would remove them from all columns, but it doesn't do anything and the NaN are still there.
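A likely reason the attempt above leaves A unchanged: all(isnan(A)) is true only for columns that consist entirely of NaN, and presumably none of the 120 columns do. The sketch below contrasts all with any on a small made-up array (the values and the names B and C are purely illustrative). Note that any removes whole columns, so it is not the per-element deletion being asked for either.

```matlab
% Made-up 3x3 array: column 2 is all NaN, column 1 has a single NaN
A = [1   NaN 3;
     4   NaN 6;
     NaN NaN 9];

allNaNcols = all(isnan(A), 1);   % true only where the whole column is NaN
anyNaNcols = any(isnan(A), 1);   % true where the column has at least one NaN

B = A(:, ~allNaNcols);   % drops column 2 only; the NaN in column 1 remains
C = A(:, ~anyNaNcols);   % keeps only columns with no NaN at all (column 3)
```

If no column is entirely NaN, ~allNaNcols is true everywhere and the indexing returns A unchanged.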
Colton McGarraugh
6 Oct 2021
When I try your second example, it gives me A now as 336,720x1 double.
Let's go a bit further. The reason that the second example gives you a long vector is simply that it vectorizes the result. If NaNs are scattered without any pattern throughout a 2 (or more) dimensional array, a vector is the only representation that can generally contain the result of removing the NaNs.
Consider the array:
A = reshape(1:20,10,2);
A([5 12 17]) = NaN
A = 10×2
1 11
2 NaN
3 13
4 14
NaN 15
6 16
7 NaN
8 18
9 19
10 20
Removing the NaNs would result in a non-rectangular array. Such a thing isn't really possible, and even if it were, the correspondence between rows would be lost. Whether that information is important in your case, I don't know.
Applying the second example vectorizes the result as explained:
A = A(~isnan(A)) % remove NaNs
A = 17×1
1
2
3
4
6
7
8
9
10
11
13
14
15
16
18
19
20
Note that 17 is no longer integer-divisible by the original number of columns. If by chance each column contains the same number of NaNs and row correspondence is of no concern...
A = reshape(1:20,10,2);
A([5 8 12 17]) = NaN
A = 10×2
1 11
2 NaN
3 13
4 14
NaN 15
6 16
7 NaN
NaN 18
9 19
10 20
A = reshape(A(~isnan(A)),[],2)
A = 8×2
1 11
2 13
3 14
4 15
6 16
7 18
9 19
10 20
Though it's unlikely that such a case applies.
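Whether the equal-count condition holds can be checked before reshaping. This is a minimal sketch that reuses the example array above; the names nanPerCol and B are only illustrative.

```matlab
A = reshape(1:20,10,2);
A([5 8 12 17]) = NaN;

nanPerCol = sum(isnan(A), 1);   % number of NaNs in each column

if all(nanPerCol == nanPerCol(1))
    % every column loses the same number of elements, so the result is rectangular
    B = reshape(A(~isnan(A)), [], size(A,2));
else
    error('Columns contain different numbers of NaNs; the result would not be rectangular.');
end
```

The reshape relies on MATLAB's column-major ordering, so the surviving elements of each column stay together, but rows of B no longer correspond to rows of A.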
There is also the possibility that NaNs can be filled based on the surrounding data. Again, it depends whether that suits your needs.
Similarly, there is the possibility that NaN removal might be unnecessary if the subsequent processing can work around them.
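For the fill-in option, one possibility is fillmissing (available in R2016b and later), which by default works down each column. The sketch below uses a made-up vector and assumes linear interpolation is an acceptable model for the data, which may well not be true for a given measurement.

```matlab
% Made-up column with two interior gaps and trailing NaN padding
x = [1; NaN; 3; NaN; 5; NaN; NaN];

% Default behavior: interior gaps are interpolated and the ends are extrapolated
xFilled = fillmissing(x, 'linear');

% Leave leading/trailing NaNs alone and fill only the interior gaps
xInterior = fillmissing(x, 'linear', 'EndValues', 'none');
```

If the NaNs are only padding at the end of shorter trials, extrapolating into them would invent data, so the 'EndValues','none' form (or no filling at all) is probably the safer choice.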
Colton McGarraugh
7 Oct 2021
Thank you so much for all your help! I think the problem is that I have a function that takes a Google Sheet, reads the data, and puts it into MATLAB. Because not every trial runs for the same time, many of the cells in the columns are left empty. As a result, when the data was put into MATLAB, every column was made the same length, and everything that was empty after the data stopped became NaN.
I think it is impossible to make the columns remove the NaN, unless there is a way to do this:
Read every single column from top to bottom and as soon as it hits NaN, it deletes everything below that since it is only NaN and then move to the next column.
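The column-by-column idea described above can be sketched with a cell array, which (unlike a numeric matrix) can hold columns of different lengths. This assumes the NaNs are only trailing padding; the small array and the names trials and firstNaN are made up for illustration.

```matlab
% Made-up stand-in for the 6400x120 array: three trials of different lengths
A = [1   10  100;
     2   20  200;
     3   NaN 300;
     NaN NaN 400];

nCols  = size(A, 2);
trials = cell(1, nCols);              % one cell per column; lengths may differ

for k = 1:nCols
    col = A(:, k);
    firstNaN = find(isnan(col), 1);   % row of the first NaN, empty if there is none
    if isempty(firstNaN)
        trials{k} = col;                  % column is full, keep it all
    else
        trials{k} = col(1:firstNaN-1);    % keep only the rows above the first NaN
    end
end

trialLengths = cellfun(@numel, trials);   % number of samples kept per trial
```

A later loop can then run over trials{k} and stop naturally at the end of each trial's data.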
For reference, this is how I called my function.
if exist('A','var') == 1 % Checks whether the array A is already in the workspace
    fprintf('Let''s get moving! The data is already here.\n')
else % If the array is NOT in the workspace, get it from the function
    fprintf('Grabbing the data now... Just one moment!\n')
    A = GetGoogleSpreadsheet('1dqtn5aTdOIuhcLbz1coBazoInQ9a2XNJcb4rUf1BdqM'); % Calls the function GetGoogleSpreadsheet
    A = str2double(A); % Converts the cell array into a double array
    A(1,:) = []; % Deletes the first row so only the measurements are left (removes the time, acceleration, position, etc. headers)
end
Part of my project is making sure that the function doesn't read the Google Sheet again if the data is already in the workspace, which is why I have the exist check in the if/else.
Image Analyst
7 Oct 2021
@Colton McGarraugh 6400 by 120 is far from large. It's just a small fraction of the size of a typical digital image. If it were 10k by 10k by 8 bytes, then we'd be approaching large.
But I question your original ask. Why do you think you need to "delete" NaNs in the first place? It might not be necessary, depending on what you want to do. For example, many functions like mean() have an 'omitnan' option. Plus, maybe you could just replace each NaN with the median of the surrounding values, like I do in my attached salt and pepper noise removal demo. Like DGM said, you can't just remove them, because then you'd have holes or "ragged" edges in the 2-D matrix, neither of which is allowed.
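As a rough illustration of both suggestions, the sketch below computes column statistics that skip NaNs, and fills NaNs with a moving median taken along each column. The array is made up, and fillmissing with 'movmedian' is a swapped-in 1-D stand-in here, not the 2-D image demo mentioned above.

```matlab
% Made-up data: two columns, each with one NaN
A = [1   10;
     2   NaN;
     3   30;
     NaN 40];

colMean = mean(A, 1, 'omitnan');   % per-column mean, ignoring NaNs
nValid  = sum(~isnan(A), 1);       % number of real samples in each column

% Replace each NaN with the median of the non-NaN values in a
% 3-element window centered on it, working down each column
F = fillmissing(A, 'movmedian', 3);
```

With 'omitnan' the array keeps its rectangular shape and row correspondence, which is why it is often the least invasive option.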
