Interpolation of in-between values in a list of different groups

Clemens Gersch on 30 Mar 2020
Edited: Guillaume on 31 Mar 2020
Hi,
I have a dataset zerocd, of which I have uploaded an extract. The variable zerocd.days can be understood as the number of days into the future from the date in the same row. What I want to do is interpolate the rates for days = 1:100 on each date. The interpolation should be based only on the rate vector of that zerocd.date, so that for a specific date the rates of all other dates are irrelevant. The interpolation should be done for every date in the dataset.
As there are 3 different dates in my extract, the goal is to have a table with the same structure as zerocd, but containing 300 rates, one for every combination of date (the three dates given) and days (1:100).
Please keep in mind that the values of zerocd.days are not the same for every zerocd.date!
Right now, my code looks like this and is very, very slow. Do you have suggestions for improvement?
% Get unique zerocd dates.
date = unique(zerocd.date);
n_date = numel(date);
% Get query vector for interpolation.
days_queried = (1:100)';
% Construct new table for interpolated values.
rate = nan(n_date, numel(days_queried));
Interpolated = table(date, rate);
Interpolated = splitvars(Interpolated, 'rate');
% Do rate-interpolation for all days on each date i
for i = 1:numel(date)
    daily_zerocd = zerocd(zerocd.date == Interpolated.date(i),:);
    daily_Interpolant = griddedInterpolant(daily_zerocd.days, ...
        daily_zerocd.rate, 'linear', 'linear');
    Interpolated{i,2:end} = daily_Interpolant(days_queried)';
end
% Stack to list format again.
Interpolated = stack(Interpolated,2:size(Interpolated,2),...
'NewDataVariableName','rate',...
'IndexVariableName','days');
Interpolated.days = repmat(days_queried,n_date,1);
EDIT: Here is a picture of the general principle.

Accepted Answer

Guillaume on 31 Mar 2020
Edited: Guillaume on 31 Mar 2020
I've not tried to understand your code to see where the slow processing is (edit: it's probably the stack). There's no reason for this processing to be slow.
Here is how I'd do it. The splitapply could be replaced by an explicit loop, which may even be faster:
% Get query vector for interpolation.
days_queried = (1:100)';
%find unique date and assign unique ID to rows of the table
[rowid, date] = findgroups(zerocd.date); %or [date, ~, rowid] = unique(zerocd.date); %if using unique
%interpolate relevant rows for each unique day
%with splitapply, output a scalar cell array containing a column vector of interpolated rate for each day.
daily_zerocd = splitapply(@(inday, inrate) {interp1(inday, inrate, days_queried, 'linear', 'extrap')}, zerocd.days, zerocd.rate, rowid);
%stuff it all in a table
Interpolated = table(repelem(date, numel(days_queried)), repmat(days_queried, numel(date), 1), vertcat(daily_zerocd{:}), ...
'VariableNames', {'date', 'days', 'rate'})
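
For reference, the explicit-loop variant mentioned above could look roughly like the sketch below. It is not taken from the original answer. The small zerocd table is hypothetical toy data made up so the snippet runs on its own, and the names rate, in_group, n_date and n_days are illustrative. Whether the loop actually beats splitapply would need to be timed on the real dataset.

```matlab
% Hypothetical toy data (values invented for illustration only):
% two dates, each with its own, different grid of days.
zerocd = table( ...
    datetime(2020, 3, [30 30 30 31 31])', ...
    [0; 50; 200; 10; 110], ...
    [0.010; 0.015; 0.030; 0.020; 0.040], ...
    'VariableNames', {'date', 'days', 'rate'});

% Query vector for interpolation.
days_queried = (1:100)';

% Unique dates and, for each row of zerocd, the index of its date.
[rowid, date] = findgroups(zerocd.date);
n_date = numel(date);
n_days = numel(days_queried);

% Preallocate one column of interpolated rates per date.
rate = nan(n_days, n_date);
for k = 1:n_date
    in_group = (rowid == k);   % rows belonging to the k-th date
    rate(:, k) = interp1(zerocd.days(in_group), zerocd.rate(in_group), ...
        days_queried, 'linear', 'extrap');
end

% Assemble the long-format table directly; rate(:) stacks the columns,
% which matches the date-by-date ordering produced by repelem/repmat.
Interpolated = table(repelem(date, n_days), repmat(days_queried, n_date, 1), rate(:), ...
    'VariableNames', {'date', 'days', 'rate'});
```

Because the result is built in long format from the start, no stack call is needed, which is the step suspected of being slow in the question's code.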
