Combine non-integer time steps into daily values

Cristóbal, 14 Oct 2025 at 15:21
Commented: Torsten, 15 Oct 2025 at 21:34
Dear Experts,
I have the following data, which have time and volume (.mat file attached).
time (Days) Volume (L)
0 0
30.6741806 0
1.168E-05 0.006798073
2.2995E-05 0.047292948
4.4165E-05 0.223091253
7.7015E-05 0.833100076
9.636E-05 2.066383935
0.00012045 4.343111658
0.000150745 8.336749383
0.000187975 14.55932459
0.000289445 22.84538025
0.00036135 32.30381193
Unfortunately, the time steps are not uniform, and I would like to group them into one-day intervals. Because each volume is tied to its time step, I cannot simply add consecutive time steps, as they do not sum to exactly 1.
With some effort in Excel I could group the values, with their respective proportional volume:
time (Days) Volume (L)
0.97022767 70827.51996
0.02977233 1854.371465
0.07209406 4490.383105
0.7850486 48689.812711
0.1428574 9532.3878414
If I sum the bold and underlined groups (the first two rows and then the last three), I get the volume for each full day:
time (Days) Volume (L)
1 72681.89142
1 62712.58366
As there are so many values (more than 900), is it possible to do this properly in MATLAB?
Thank you!
EDITED underlined values
Star Strider, about 16 hours ago
Edited: Star Strider, about 14 hours ago
I cannot make any sense of that.
How did you decide on those particular values and ranges?
What criteria define the 'time' ranges?
(I experimented by summing the consecutive time values using cumsum and then summing the volume values that corresponded to the first time sum that was less than or equal to 1. My results were not at all similar to yours.)
EDIT --
I am giving up on even trying to understand this.
I have deleted my Answer.
.
Cristóbal, about 14 hours ago
The ranges are defined by summing time steps until the total reaches 1, starting from 1.168E-5, because the previous value, 30.6741806, has no volume.
For the 49 values that follow 30.6741806, I sum all 49 (rows 4 to 52 in the Excel file): 1.168E-5 + 2.2995E-5 + 4.4165E-5 + ... + 0.084265725 + 0.092457785 = 0.97022767
The next value (row 53 in the Excel file) is 0.10186639, but I can't add all of it to 0.97022767 because the result would be greater than 1 (1.07209406). Instead, I take only the part of 0.10186639 needed to reach 1, which is 0.02977233.
So 0.97022767 + 0.02977233 = 1 day.
The volume associated with each row is added in full or taken proportionally, which gives the volume for that 1 day.
The remainder of 0.10186639, which is 0.07209406, carries over to the next group. To it I add the sum of rows 54 to 59 (0.10856925 + 0.1156904 + ... + 0.146811395 + 0.1513363 = 0.7850486).
Together these again give less than 1: 0.07209406 + 0.7850486 = 0.8571426
So I take a proportion of the following row, because using the complete value would again give a number greater than 1. Row 60 is 0.152159375, so I take only 0.1428574 because
0.07209406 + 0.7850486 + 0.1428574 = 1 day
The remainder of 0.152159375, which is 0.0093020, carries over to the next group. Again, the volume associated with each row is added in full or taken proportionally, which gives the volume for that 1 day.
I hope this clarifies things. Thank you for taking the time to analyse this =)
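
The grouping rule described above can be sketched as a loop. This is only a minimal illustration under stated assumptions, not code from the thread: the vectors dt and dV are made-up example data (not the attached file), the variable names are hypothetical, and each volume is assumed to be the amount belonging to its own time step, so that a fraction of a step carries the same fraction of its volume.

```matlab
% Hypothetical example data (NOT the attached file): step lengths in days
% and the volume belonging to each step.
dt = [0.5; 0.25; 0.5; 0.75];    % sums to 2 days
dV = [50;  25;   100; 30];      % litres per step

tol      = 1e-12;                   % guards against round-off in the sums
nDays    = floor(sum(dt) + tol);    % number of complete days available
dailyVol = zeros(nDays, 1);
day      = 1;                       % day currently being filled
room     = 1;                       % time still missing to complete that day

for k = 1:numel(dt)
    left = dt(k);                   % part of step k not yet assigned
    while left > 0 && day <= nDays
        take = min(left, room);     % portion of the step that fits in this day
        % Assign the same proportion of the step's volume to the day
        dailyVol(day) = dailyVol(day) + dV(k) * take / dt(k);
        left = left - take;
        room = room - take;
        if room <= tol              % day is complete: start the next one
            day  = day + 1;
            room = 1;
        end
    end
end

disp(dailyVol)
```

With this example data, half of the third step (50 L of its 100 L) completes day 1 and the other half starts day 2. Steps of zero length are skipped by the loop, so any volume attached to them would be dropped, and whatever remains after the last complete day is ignored.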


Answers (1)

Torsten, 15 Oct 2025 at 0:14
Edited: Torsten, 15 Oct 2025 at 0:29
Looking at the volume values, I'm almost sure that they are already cumulative. Why else would the increase in volume at later times be so enormous within relatively small time spans? But I compute both variants below.
So apply "cumsum" to the first column of your data to get the actual time, and either leave the second column as is or apply cumsum to it as well. Then use interp1 to interpolate the volume at daily values.
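LD = load('timesteps.mat');
% The time column holds step lengths, so accumulate them to get elapsed time
Tcum = cumsum(LD.time);
% Variant 1: the volume column is already cumulative
V = LD.volume;
% Variant 2: the volume column holds per-step amounts
Vcum = cumsum(LD.volume);
% Query points one day apart, up to the last complete day
T_daily = Tcum(1):floor(Tcum(end));
V_daily = interp1(Tcum,V,T_daily);
Vcum_daily = interp1(Tcum,Vcum,T_daily);
% Plot variant 1
figure(1)
plot(T_daily,V_daily)
grid on
% Plot variant 2
figure(2)
plot(T_daily,Vcum_daily)
grid on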
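
If the volume column holds per-step amounts, differencing the interpolated cumulative volume gives a volume per day, and linear interpolation splits a step that straddles a day boundary in proportion to time. The sketch below illustrates this on made-up data; dt, dV and the other names are hypothetical and not taken from the attached file.

```matlab
% Hypothetical example data (NOT the attached file): step lengths in days
% and the volume belonging to each step.
dt = [0.5; 0.25; 0.5; 0.75];
dV = [50;  25;   100; 30];

% Cumulative time and volume, with an explicit starting point at (0, 0)
Tcum = [0; cumsum(dt)];
Vcum = [0; cumsum(dV)];

% Whole days that lie inside the data range
dayEdges = 0:floor(Tcum(end));

% Cumulative volume at each day boundary (linear interpolation)
VcumAtDay = interp1(Tcum, Vcum, dayEdges);

% Volume accumulated within each day
dailyVol = diff(VcumAtDay);

disp(dailyVol)
```

Here the cumulative volume at day 1 falls halfway through the third step, so that step's 100 L is split 50/50 between the two days. interp1 needs distinct sample points, so steps of zero length (which repeat a value in Tcum) would have to be removed or merged first.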
Umar, about 13 hours ago
% Robust approach to handle duplicate or near-duplicate time values
% Load your data
load('timesteps.mat');
% Calculate cumulative time
Tcum = cumsum(time);
Vcum = volume;
%% Method 1: Remove duplicates and near-duplicates using uniquetol
% This handles values that are "close enough" (within tolerance)
tolerance = 1e-10; % Adjust based on your precision needs
[Tcum_unique, unique_idx] = uniquetol(Tcum, tolerance, 'DataScale', 1);
Vcum_unique = Vcum(unique_idx);
fprintf('Original data points: %d\n', length(Tcum));
fprintf('After removing duplicates: %d\n', length(Tcum_unique));
fprintf('Removed %d duplicate/near-duplicate points\n\n', ...
    length(Tcum) - length(Tcum_unique));
%% Method 2: Average values at duplicate time points (alternative approach)
% This preserves information if you have legitimate duplicates with different volumes
[Tcum_unique2, ~, ic] = uniquetol(Tcum, tolerance, 'DataScale', 1);
Vcum_unique2 = accumarray(ic, Vcum, [], @mean);
%% Proceed with interpolation using cleaned data
T_daily = 0:1:floor(Tcum_unique(end));
Vcum_daily = interp1(Tcum_unique, Vcum_unique, T_daily, 'linear', 'extrap');
% Calculate daily increments
V_daily = diff(Vcum_daily);
T_daily_increments = T_daily(2:end);
%% Verification
fprintf('Final cumulative volume: %.2f L\n', Vcum_daily(end));
fprintf('Sum of daily increments: %.2f L\n', sum(V_daily));
fprintf('Original final volume: %.2f L\n', Vcum(end));
fprintf('Difference: %.2f L (%.4f%%)\n\n', ...
  Vcum(end) - Vcum_daily(end), ...
  100*abs(Vcum(end) - Vcum_daily(end))/Vcum(end));
%% Additional quality check: identify problematic duplicates
% Find time differences between consecutive points
time_diffs = diff(Tcum);
small_diffs = time_diffs < 1e-6; % Flag very small time steps
if any(small_diffs)
    fprintf('Warning: Found %d time intervals smaller than 1e-6 days\n', ...
        sum(small_diffs));
    fprintf('First few occurrences at indices: %s\n', ...
        mat2str(find(small_diffs, 5)'));
    % Show examples
    idx_examples = find(small_diffs, 3);
    if ~isempty(idx_examples)
        fprintf('\nExample near-duplicates:\n');
        for i = 1:length(idx_examples)
            idx = idx_examples(i);
            fprintf('  Point %d: Time=%.12f, Volume=%.2f\n', ...
                idx, Tcum(idx), Vcum(idx));
            fprintf('  Point %d: Time=%.12f, Volume=%.2f\n', ...
                idx+1, Tcum(idx+1), Vcum(idx+1));
            fprintf('  Difference: %.2e days\n\n', time_diffs(idx));
        end
    end
end
%% Create results table
results_table = table(T_daily_increments', V_daily', ...
  'VariableNames', {'Time_Days', 'Daily_Volume_L'});
% Display first 50 rows
fprintf('First 50 daily volumes:\n');
disp(results_table(1:min(50, height(results_table)), :));
%% Visualization
figure('Position', [100 100 1200 500]);
subplot(1,2,1)
plot(T_daily, Vcum_daily, 'b-', 'LineWidth', 1.5)
hold on
plot(Tcum_unique, Vcum_unique, 'r.', 'MarkerSize', 4)
xlabel('Time (Days)')
ylabel('Cumulative Volume (L)')
title('Cumulative Volume Over Time')
legend('Interpolated Daily', 'Original Data', 'Location', 'northwest')
grid on
subplot(1,2,2)
plot(T_daily_increments, V_daily, 'b-', 'LineWidth', 1)
xlabel('Time (Days)')
ylabel('Daily Volume Increment (L)')
title('Daily Volume Changes')
grid on
ylim([0 max(V_daily)*1.1]) % Better visualization
%% Export results (optional)
% Uncomment to save results
% writetable(results_table, 'daily_volumes_cleaned.xlsx');
% fprintf('Results exported to daily_volumes_cleaned.xlsx\n');

Note: please see attached results.

Torsten, about 12 hours ago
What improvement could be made so that interp1 can be used when the data has repeated values in x (i.e., the x vector is not unique)?
Before applying interp1, remove nearly equal data points using "uniquetol".



Release: R2025a
