How to manage NaN values and calculate mean under conditions

Question

Daphne PARLIARI 2021년 7월 21일

0
링크

이 질문에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/883083-how-to-manage-nan-values-and-calculate-mean-under-conditions

답변: Peter Perkins 2021년 7월 27일

채택된 답변: Simon Chan

Daily data.xlsx

MATLAB Online에서 열기

Hi guys! I need your help on that.

In the attached file I have daily values of Temperature. What I want to calculate for every daily value is the Factor

Factor(i) = (Ti+Ti-1+Ti-2)/3

The problem starts when trying to "tidy up" NaN values:

If one of Ti,Ti-1,Ti-2 is NaN, then

Factor(i) = (Ti+Ti-1)/2 [assuming that Ti-2=NaN].

If two of Ti,Ti-1,Ti-2 are NaN, then

Factor(i) = Ti [assuming that Ti-2=NaN=Ti-1].

If all of Ti,Ti-1,Ti-2 are NaN, then

Factor(i) = 'NaN'

In the (most hopeful) case that none of the three are NaN, then

Factor(i) = (Ti+Ti-1+Ti-2)/3

Here is what I have done so far, but it doesnt work as expected

Daily_T = Imported_data.Tmean;
Daily_T = array2table(Daily_T);
[col] = height(Daily_T);
Factors = zeros(4018,2);
for i = 1:col  
 if (isnan(Daily_T{i,1}))
  Factors(i,1) = (1/2)* (Daily_T{i-1,1}+Daily_T{i-2,1});
 elseif (isnan(Daily_T{i-1,1})) 
  Factors(i,1) = (1/2)* (Daily_T{i,1}+Daily_T{i-2,1});
 elseif (isnan(Daily_T{i-2,1})) 
  Factors(i,1) = (1/2)* (Daily_T{i,1}+Daily_T{i-1,1});
 elseif (isnan(Daily_T{i,1})) && (isnan(Daily_T{i-1,1}))
  Factors(i,1) =   Daily_T{i-2,1}
 elseif   (isnan(Daily_T{i,1})) && (isnan(Daily_T{i-2,1}))
  Factors(i,1) =   Daily_T{i-1,1} 
 elseif (isnan(Daily_T{i-1,1})) && (isnan(Daily_T{i-2,1}))
  Factors(i,1) =   Daily_T{i,1}
 elseif (isnan(Daily_T{i,1})) && (isnan(Daily_T{i-1,1})) && (isnan(Daily_T{i-2,1}))
  Factors(i,1) = 'NaN';
 else
  Factors(i,1) = (1/3)* (Daily_T{i,1}+Daily_T{i-1,1}+Daily_T{i-2,1});
 end
end

However, Factors are not built as it should... Can anyone point out where is the flaw of my code please?

PS. I'm on Matlab 2019a

댓글 수: 2
없음 표시없음 숨기기

KSSV 2021년 7월 21일

Read about nanmean. Also there are options to give omitnan while calculating means. Read the documentation.

Daphne PARLIARI 2021년 7월 21일

How can I use nanmean in (let's say)

Factors(i,1) = (1/2)* (Daily_T{i-1,1}+Daily_T{i-2,1}) ?

When I tried earlier, I got error.

댓글을 달려면 로그인하십시오.

이 질문에 답변하려면 로그인하십시오.

Answer 1

Simon Chan 2021년 7월 21일

0
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/883083-how-to-manage-nan-values-and-calculate-mean-under-conditions#answer_751213

편집: Simon Chan 2021년 7월 21일

MATLAB Online에서 열기

Try the following:

summation = sum for each group of data (Each group has 3 data)

notnandata = count the number of data which are not NaN. So if the entire group of data contains only NaN, it will output zero. And if this value is zero, set it to NaN.

Noticed that Factor(1) calculate the mean value for rawdata.Tmean(1:3). If you don't like this pattern, you may need to adjust the indexing yourself.

rawdata=readtable('Daily data.xlsx');
idx = 1:size(rawdata.Tmean,1)-2;
summation = arrayfun(@(x) sum(rawdata.Tmean(x:x+2),'omitnan'),idx,'UniformOutput',false);
notnandata = cell2mat(arrayfun(@(x) sum(~isnan(rawdata.Tmean(x:x+2))),idx,'UniformOutput',false));
notnandata(notnandata==0)=NaN;
Factor = cell2mat(summation)./notnandata

댓글 수: 3
이전 댓글 1개 표시이전 댓글 1개 숨기기

Simon Chan 2021년 7월 22일

편집: Simon Chan 2021년 7월 22일

MATLAB Online에서 열기

Yes, the size for 'Factor' in your previous example is 29*1 only, because the size was deducted by 2 in the code ( idx = 1:size(rawdata.Tmean,1)-2)

So for your new data, it can be modified a little bit as follows. Note that the code just calculate the entire data in the column and you need to define which date is Ti.

If Ti is 2 Feb 2006, Ti-3 to Ti-32 would be 30 Jan 2006 to 1 Jan 2006 if I understand correctly and this is the first data in 'Factor30'.

Of course, using movmean suggested by dpb is much simpler and powerful.

rawdata=readtable('Daily data (1).xlsx');
num_days = 30;                                      % Modify a little bit and you can assign number of days here
idx = 1:size(rawdata.Tmean,1)-num_days+1;
summation = arrayfun(@(x) sum(rawdata.Tmean(x:x+num_days-1),'omitnan'),idx,'UniformOutput',false);
notnandata = cell2mat(arrayfun(@(x) sum(~isnan(rawdata.Tmean(x:x+num_days-1))),idx,'UniformOutput',false));
notnandata(notnandata==0)=NaN;
Factor30 = cell2mat(summation)./notnandata

Daphne PARLIARI 2021년 7월 22일

The modification for Factor30 works! Thanks!

But what is the difference between your approach and

idx = 33:size(Dailydata1.Tmean,1);

summation = arrayfun(@(x) sum(Dailydata1.Tmean(x-32:x-3),'omitnan'),idx,'UniformOutput',false);

notnandata = cell2mat(arrayfun(@(x) sum(~isnan(Dailydata1.Tmean(x-32:x-3))),idx,'UniformOutput',false));

notnandata(notnandata==0)=NaN;

Factor30 = cell2mat(summation)./notnandata;

that I tried myself? Besides the length, which in my case is 3986*1 and in yours 3989*1 .

In addition, I would use movmean as dpb suggested but I don't know how to make it work for Factor30.

댓글을 달려면 로그인하십시오.

Answer 2

dpb 2021년 7월 21일

0
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/883083-how-to-manage-nan-values-and-calculate-mean-under-conditions#answer_751218

편집: dpb 2021년 7월 21일

MATLAB Online에서 열기

M=movmean(Daily,[0 2],'omitnan');

or, for the specific file

>> tDaily=readtable('Daily data.xlsx');
>> tDaily.MTmean=movmean(tDaily.Tmean,[0,2],'omitnan');
>> head(tDaily)
ans =
  8×3 table
    Daily_Date     Tmean    MTmean
    ___________    _____    ______
    01-Jan-2006     9.33    11.68 
    02-Jan-2006    11.90    13.04 
    03-Jan-2006    13.80    12.76 
    04-Jan-2006    13.43    11.53 
    05-Jan-2006    11.05     9.92 
    06-Jan-2006    10.12     8.39 
    07-Jan-2006     8.59     7.52 
    08-Jan-2006     6.45     6.24 
>> 

댓글 수: 4
이전 댓글 2개 표시이전 댓글 2개 숨기기

Daphne PARLIARI 2021년 7월 22일

Regarding question 2:

[29 0] means that moving average starts from current day and extends 29 days before that, correct?

In Factor_2 (i) = (Ti-3+T-4-1+...+Ti-32)/30 I want to start 3 days before current day (Ti-3) and extend up until 32 days before current day (Ti-32).

Could M=movmean(Daily,[0 32],'omitnan'); and then discarding the first three averages do the trick?

Is this way of thinking efficient?

dpb 2021년 7월 22일

편집: dpb 2021년 7월 22일

"movmean(A,[kb kf]) computes the mean with a window of length kb+kf+1 that includes the element in the current position, kb elements backward, and kf elements forward."
From 1. above, ergo [0 32] would be from 0 to 32 past point i, not before.(*)

Again, "read the documentation" combined with experimenting with a sample dataset short enough to be able to watch the results; simply using 1:10 so can easily verify what the results should be/how are calculated would be an ideal debugging tool. You don't need all 32 to test what the various combinations are how how to manipulate the series to get what you're shooting for.

(*) Of course, you could fliplr() the series, then do the averaging and fliplr() back, but why not just put the point offset in in the correct order to begin with? The offset would be needed to use movmean with both elements negative; TMW didn't think of that possibility and won't accept anything <0 as the second argument. There's no real reason it couldn't; just that they didn't think of it -- all that would be required is that the start index be less than the ending one.

댓글을 달려면 로그인하십시오.

Answer 3

Peter Perkins 2021년 7월 27일

0
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/883083-how-to-manage-nan-values-and-calculate-mean-under-conditions#answer_754434

MATLAB Online에서 열기

Another possibility that uses and preserves a timetable:

>> tt = readtimetable("Daily data.xlsx");
>> head(tt)
ans =
  8×1 timetable
    Daily_Date     Tmean 
    ___________    ______
    01-Jan-2006    9.3282
    02-Jan-2006    11.901
    03-Jan-2006    13.802
    04-Jan-2006    13.427
    05-Jan-2006    11.052
    06-Jan-2006    10.124
    07-Jan-2006    8.5933
    08-Jan-2006    6.4544
>> ttSm = smoothdata(tt,'movmean',days([0,2]),'omitnan');
>> head(ttSm)
ans =
  8×1 timetable
    Daily_Date     Tmean 
    ___________    ______
    01-Jan-2006    11.677
    02-Jan-2006    13.043
    03-Jan-2006     12.76
    04-Jan-2006    11.534
    05-Jan-2006    9.9229
    06-Jan-2006    8.3905
    07-Jan-2006    7.5239
    08-Jan-2006    6.2444

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

How to manage NaN values and calculate mean under conditions

댓글 수: 2
없음 표시없음 숨기기

채택된 답변

댓글 수: 3
이전 댓글 1개 표시이전 댓글 1개 숨기기

추가 답변 (2개)

댓글 수: 4
이전 댓글 2개 표시이전 댓글 2개 숨기기

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

참고 항목

카테고리

태그

Community Treasure Hunt

How to manage NaN values and calculate mean under conditions

댓글 수: 2 없음 표시없음 숨기기

채택된 답변

댓글 수: 3 이전 댓글 1개 표시이전 댓글 1개 숨기기

추가 답변 (2개)

댓글 수: 4 이전 댓글 2개 표시이전 댓글 2개 숨기기

댓글 수: 0 이전 댓글 -2개 표시이전 댓글 -2개 숨기기

참고 항목

카테고리

태그

Community Treasure Hunt

댓글 수: 2
없음 표시없음 숨기기

댓글 수: 3
이전 댓글 1개 표시이전 댓글 1개 숨기기

댓글 수: 4
이전 댓글 2개 표시이전 댓글 2개 숨기기

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기