Replace NaNs in timeseries with long-term median for specific dates
I have a multiyear time series with an hourly timestep that contains NaNs. I want to replace each NaN with the median value for that specific hour, day, and month, calculated over all years. In the example below, I want to take the calculated median value in G and insert it into any matching hour with missing data in X. I have played around with ismember and various timetable options, but am stuck. Help appreciated!
% date range
t1 = datetime(2015,1,1,1,0,0);
t2 = datetime(2020,12,31,23,0,0);
t = (t1 : hours(1) : t2)';
%make up some data with random NaN's
X = rand(size(t));
idx = randsample(size(X,1), floor(size(X,1)/3)); % floor keeps the sample size an integer
X(idx,:) = NaN;
%convert to timetable
T = timetable(t,X);
T.Month = month(T.t,'monthofyear');
T.Day = day(T.t,'dayofmonth');
T.Time = timeofday(T.t);
%calculate the median value of each hour in a year
G = groupsummary(T,{'Month','Day','Time'},'median','X');
%Where there are NaN's in X, insert the median value from G at the matching
%Month, Day and Time in each year
Accepted Answer
Luca Ferro
24 Feb 2023
See if this helps; it's conceptually the same thing, but with the mean.
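For the median case asked about in the question, one possible approach is sketched below. It is only an illustration, not the method from the answer: it assumes `findgroups` and `splitapply` are available (R2015b or later), rebuilds the question's example data with `randperm` instead of `randsample` to avoid a toolbox dependency, and introduces the variable names `g`, `med` and `missing` purely for this example.

```matlab
% Rebuild the example data from the question (hourly, 2015 to 2020)
t = (datetime(2015,1,1,1,0,0) : hours(1) : datetime(2020,12,31,23,0,0))';
X = rand(size(t));
X(randperm(numel(X), floor(numel(X)/3))) = NaN;   % set about a third to NaN
T = timetable(t, X);

% One group number per (month, day, hour) combination, shared across years
g = findgroups(month(T.t), day(T.t), hour(T.t));

% Median of each group over all years, ignoring NaNs
med = splitapply(@(x) median(x, 'omitnan'), T.X, g);

% Replace each NaN with the median of the group it belongs to
missing = isnan(T.X);
T.X(missing) = med(g(missing));
```

A NaN stays in place wherever every year is missing for that hour (more likely on 29 February, which only occurs twice in this range), because the group median is itself NaN there. Those leftovers would need a fallback such as `fillmissing`.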