Hello,
I have a tough situation in hands.
File format (sample file attached)
day_of_year | solar height | d1 | d2 | d3 | d4
I need to apply a correction to d1 to d4 based on solar height.
For each day I need to fing the average value of d1, d2, d3 and d4 for:
1- once solar height hits -18 (degrees) in the morning I need to average the previous 60 data values (if this happens on cell 101, i need the average of cells 40 to 100)
2 -once solar height hits -18 (degrees) in the afternoon/night I need to average the following 60 data values
3 - Obtain the average value between 1 and 2 for each day of the year.
Note: Solar Height is negative before sunrise and after sunset, being positive during the day.
I don't have really an idea on how to proceed from here...
This file has 59 days, so the result would be a matrix 59*4 results.
thanks for the attention

 채택된 답변

Guillaume
Guillaume 2019년 6월 28일

1 개 추천

Here is how I'd do it. First create a function that takes a Mx5 matrix of {solar_height, d1 ... d4} data for a single day and return your desired 1x4 output for that day:
function dayaverage = processday(solardata)
%solardata: matrix whose first column is the solar height. A MxN matrix
%dayaverage: the average solar data for the day as 1x(N-1) vector
%... explanation of algorithm goes here
morningstart = find(solardata(:, 1) >= -18, 1); %find where solar height first goes above -18
assert(morningstart > 60, 'morning starts too early to be able to average the previous 60 rows');
morningaverage = mean(solardata(morningstart-60:morningstart-1, 2:end), 1);
afternoonend = find(solardata(:, 1) >= -18, 1, 'last'); %find where solar height is last above -18
assert(afternoonend < size(solardata, 1) - 60, 'afternoon ends too late to be able to average the next 60 rows');
afternoonaverage = mean(solardata(afternoonend+1:afternoonend+60, 2:end), 1);
dayaverage = (morningaverage + afternoonaverage) / 2;
end
Then import your data into a table:
solardata = readtable('test_data.txt');
solardata.Properties.VariableNames = {'day_of_year', 'solar_height', 'd1', 'd2', 'd3', 'd4'};
head(solardata) %if you want to preview the table
It's then trivial to call the above function for each day:
dailysolardata = rowfun(@processday, solardata, 'GroupingVariable', 'day_of_year', 'SeparateInputs', false, 'OutputVariableName', 'd')
if you prefer individual dn variable in the output instead of Mx4 variable:
dailysolardata = splitvars(dailysolardata, 'd', 'NewVariableNames', compose('d%d', 1:4))

댓글 수: 7

Osnofa
Osnofa 2019년 7월 9일
thank you for the input.
Tried this one and the obtained result is for an average year, 366 days only.
Meanwhile, the other answer seems to be able to work with some change.
Thank you for your time!
Guillaume
Guillaume 2019년 7월 9일
The other answer has a few efficiency issues (growing array in a loop) and at least one bug (using length on a matrix). I've not looked into it in details to see if there was more issue with it.
I have no idea what you mean by the obtained result is for an average year, 366 days only.
The result is a table with 59 rows, just as you asked. If it doesn't do what you want then you need to give more details.
Osnofa
Osnofa 2019년 7월 9일
편집: Osnofa 2019년 7월 11일
you are right. I gave you an sample file with only 59 days and meanwhile I was using the whole file. My bad, sorry for the confusion.
Please find a new file (2 years of data) in the following link (bigger than the allowed 5mb):
I'm using 4.5 years, but I imagine that for testing purposes 2 years is enough.
Using 4.5 years the dailysolardata array has the following size: 366*3
I've used the first option you gave: https://ufile.io/9iufdg9j
dailysolardata = rowfun(@processday, solardata, 'GroupingVariable', 'day_of_year', 'SeparateInputs', false, 'OutputVariableName', 'd')
Am I missing something?
Thanks
Well, you need a way to differentiate days of each year. The simplest way is to add a year variable to the table and use that as an additional grouping variable. The processday function itself doesn't need to change:
solardata = readtable('testdata.txt');
%the number of columns is different in this new file. The following will adapt to however many columns there are
solardata.Properties.VariableNames = [{'day_of_year', 'solar_height'}, compose('d%d', 1:width(solardata)-2)];
solardata.year = cumsum([1; diff(solardata.day_of_year) < 0]); %add a year variable
dailysolardata = rowfun(@processday, solardata, 'GroupingVariables', {'year', 'day_of_year'}, 'SeparateInputs', false, 'OutputVariableName', 'd')
%optional, to go back to having separate d variables:
dailysolardata = splitvars(dailysolardata, 'd', 'NewVariableNames', solardata.Properties.VariableNames(3:end-1))
Osnofa
Osnofa 2019년 7월 11일
First of all, thanks for your time.
That solves the referred issue. However, I'm still seeing 2 issues:
  1. gettin NaN values in dailysolardata output. I've changed the mean to nanmean in the determination of morningaverage and afternoonaverage - within the fuction. Tried also to do that in the determination of dayaverage but without success - still getting NaN outputs.
  2. I've compared the dailysolardata results for day 7 (First output with value) and it is not correct - still have not found what may be wrong.
Regarding 1, in the first days of the dataset there are no values for the referred period before sunrise, only after sunset. In this case, the final average value would be the average of the 60 values after sunset when solar height >=-18.
Guillaume
Guillaume 2019년 7월 11일
편집: Guillaume 2019년 7월 11일
1) is easy to fix by replacing the last line of the function with:
dayaverage = mean([morningaverage; afternoonaverage], 'omitnan');
You'll still get NaN if both morning and afternoon are NaN. You may or may not want to add the 'omitnan' option to the calculation of morningaverage and afternoonaverage. But if you do, there will be less than 60 values used for the mean calculation when NaNs are present.
Note that there's no use for nanmean anymore, the mean function can omit nan as shown, and doesn't require any toolbox.
As for 2, what values where you expecting and which range of rows did you expect to be averaged for morning and afternoon of day 7?
Osnofa
Osnofa 2019년 7월 12일
Thanks.
meanwhile I had solved 1. I used nanmean instead of mean with the omitnan flag.
Regarding 2, I was performing the double check on a slightly different file (other tests). It was correct, my bad.
Just a note:
This process takes way much more time compared to the other proposed solution, but it works and it servers the purpose. For now I don't need efficiency, only to do the task.
Thanks for the solution :)

댓글을 달려면 로그인하십시오.

추가 답변 (1개)

Shubham Gupta
Shubham Gupta 2019년 6월 28일

1 개 추천

I assumed that you have test_data in a mat file, for this answer I am assuming the data stored in the variable named 'TestData'. I think you should achieve the desired result with the following code :
Ind = find(abs(TestData(:,2)+18)<0.097); % Finding indices that are close to Solar height of -18 (deg)
d_SH = [0;diff(TestData(:,2))]; % Variation in solar height
%% Loop to store average values
for i=1:length(Ind)
j = Ind(i);
if d_SH(j)>0 % If solar height is increasing ( Sunrise )
Out(i,:) = mean(TestData(j-60:j,3:6));
else % If solar height is decreasing ( Sunset )
Out(i,:) = mean(TestData(j:j+60,3:6));
end
end
%% Averaging Sunrise and Sunset data
for k=1:length(Out)/2
Desired_out(k,:) = mean(Out(2*k-1:2*k,:));
end

댓글 수: 1

Osnofa
Osnofa 2019년 7월 9일
편집: Osnofa 2019년 7월 11일
Hello,
Thanks for the input and sorry for the late reply. I've been testing this and found some inconsistency.
Ind = find(abs(TestData(:,2)+18)<0.097); % Finding indices that are close to Solar height of -18 (deg)
This indice is not always correct. how did you choose the 0.097 value?
the number of values obtained for Ind is weird because it is 3852, being 2 per day it means it should be 1926 days. However, the dataseries has 1642 days. I'm using the complete data that can be found here: https://ufile.io/9iufdg9j
The indice is not allways correct, sometimes it passed the target value and others is ok.
Thanks.

댓글을 달려면 로그인하십시오.

카테고리

도움말 센터File Exchange에서 Gravitation, Cosmology & Astrophysics에 대해 자세히 알아보기

태그

질문:

2019년 6월 27일

댓글:

2019년 7월 12일

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by