Create a table with Revenue by financial year such as 2017-18

조회 수: 2 (최근 30일)
Philip Chindamo
Philip Chindamo 2019년 4월 3일
댓글: dpb 2019년 4월 12일
From the excel data attached I’m trying to create a table with Revenue by financial year such as 2017-18 (where financial year is the sum of the quarters September 2017 + December 2017 + March 2018 + June 2018).
The table below shows what I need to get to from the raw data I attach. I tried writing the code attached but this sums the Revenue for all periods to give me a total of 8.8 for the Code category ‘abc’.
Do you know how I can do this table as below?
Code 2017-18 2018-19
abc 4 4.8
def 6.2 6.4
ghi 8 12

채택된 답변

dpb
dpb 2019년 4월 4일
편집: dpb 2019년 4월 5일
t=readtable('chindamo.xlsx'); % read the input table data
t.Code=categorical(t.Code); % I just like categorical for such...
t.AcctYear=fix((t.Date-t.Date(1))/duration(24*365,0,0)); % compute index variable of accounting year
g=findgroups(t.AcctYear,t.Code); % get the grouping variable by year, code
fnRev=@(q,p) sum(q.*p); % define revenue summation functional
revenue=splitapply(fnRev,t.Quantity,t.Price,g); % and calculate for each group
revenue=reshape(revenue,numel(categories(t.Code)),[]); % orient by code for each year
returns the array of values
>> revenue=reshape(revenue,numel(categories(t.Code)),[])
revenue =
4.00 4.80
6.20 6.40
8.00 12.00
>>
from which you can build whatever structure you wish...
NB: the "trick" here is to compute the accounting year indicator variable as the integer portion of a duration of days. This relies on the data being ordered such that the first date of the desired year start month is the first element in the date vector as it simply is checking for the multiple of 365 days. It would take some fixup logic to deal with leap years.
The retime function with a timetable could be made to work by generating a specific new time vector; it's unfortunate that its builtin periods are based solely on a Jan1 - Dec31 definition of year start/end; it seems a logical extension to be able to define also the DOY that a fiscal year starts/ends; there certainly are sufficient uses for such. It's possible the Financial Toolbox might have such things; I don't have it and haven't searched...
ADDENDUM:
The general solution for the duration vector is:
t.AcctYear=fix((t.Date-t.Date(1))./duration(24*365+isleapyr(tt),0,0);
where
function is=isleapyr(yr)
% returns T for input year being a leapyear
if isdatetime(yr), yr=year(yr); end
is=eomday(yr,2)==29;
  댓글 수: 10
dpb
dpb 2019년 4월 5일
편집: dpb 2019년 4월 5일
t.AcctYear=fix((t.Date-t.Date(1))/duration(length(t.Date)*365,0,0));
Works by happenstance as written; it just happens that have the right numerical value for the duration interval--for the non-leap-year case it should just be 8760 hours; just happens that there were 24 elements in the array.
Posted the general solution as addendum...it's actually not as difficult to build as seemed after had a little sleep overnight... :)
dpb
dpb 2019년 4월 5일
Oh! Another way to approach it that is a little less exotic in having to think about how to combine months across calendar years would be with a timetable object and using timerange to pick out the time subsets by range of months by year. These could then be used to build the AccountingYear corollary variable.
Would be a little more actual code and would want to write a little function to do it, but it would let it be written using higher-level abstraction of the dates as recognizable dates rather than the admittedly somewhat arcane conversion by elapsed time above.
Of course, if the above were wrapped in a function, its interface could be more user-friendly as well and the messy stuff, done once, hidden from view.

댓글을 달려면 로그인하십시오.

추가 답변 (2개)

Peter Perkins
Peter Perkins 2019년 4월 9일
I'm a little late to the party, but here's a version using rowfun and unstack. It doesn't require "balanced data", but maybe dpb's final version doesn't either I guess.
>> t = readtable('test.xlsx');
>> t.Code = categorical(t.Code); % probably want to enumerate the codes as cat names
>> edges = datetime(2017,9,1):calyears(1):datetime(2019,9,1); % or over whatever range you want
>> t.FinYear = discretize(t.Date,edges,'categorical',["y2017_2018" "y2018_2019"]);
>> revenue = rowfun(@(q,p) sum(q.*p), t, ...
'InputVariables',["Quantity" "Price"],'GroupingVariables',["FinYear" "Code"])
revenue =
6×4 table
FinYear Code GroupCount Var4
__________ ____ __________ ____
y2017_2018 abc 4 4
y2017_2018 def 4 6.2
y2017_2018 ghi 4 8
y2018_2019 abc 4 4.8
y2018_2019 def 4 6.4
y2018_2019 ghi 4 12
>> revenue = unstack(revenue,"Var4","FinYear",'GroupingVariables',"Code")
revenue =
3×3 table
Code y2017_2018 y2018_2019
____ __________ __________
abc 4 4.8
def 6.2 6.4
ghi 8 12
  댓글 수: 3
Steven Lord
Steven Lord 2019년 4월 9일
There's no way to redefine what the NEWTIMESTEP inputs like 'yearly' mean, but you can specify a NEWTIMES vector.
startTime = datetime('1 July 2015', 'InputFormat', 'd MMMM yyyy');
twentyFinancialYears = startTime + calyears(0:19).'
The "Interpolate Timetable Data to Time Vector" example on the retime documentation page shows specifying a specific vector of times, though the spacing there is 30 minutes rather than a year.
I used calyears instead of years in building twentyFinancialYears because otherwise the extra roughly quarter of a day and leap days would prevent the dates in twentyFinancialYears from landing exactly on midnight of July 1st in later years. Compare:
[startTime + years(0:19).', startTime + calyears(0:19).']
dpb
dpb 2019년 4월 9일
"no way to redefine what the NEWTIMESTEP inputs like 'yearly' mean, but you can specify a NEWTIMES vector..."
Yeah, that was one of my earlier comments on alternate approach.
OTOMH, doesn't seem like there would be any reason that one couldn't add that feature, though, of keeping the named divisions but simply provide them a reference start date.
Granted, it would be "syntactic sugar", but that's what the end purpose of Matlab is all about -- rapid development, not figuring out how to workaround things (even though we like ML mostly because it is fun to try to figure those things out, but we're not the majority of the users).

댓글을 달려면 로그인하십시오.


Philip Chindamo
Philip Chindamo 2019년 4월 10일
Hi everyone - thanks for the contributions and I'll take a look if it works for my data. However, more generally, why has it taken your time as 'experts' to help me on a most simple thing as starting a financial year on 1 July? Surely matlab should/could have this as a function? It's taken this long to wrangle the date field and I haven't even got to analysis yet. Is this the state of Matlab and data science in this day and age?
  댓글 수: 3
Peter Perkins
Peter Perkins 2019년 4월 10일
To me, the financial calendar aspect of this, while perhaps not immediately obvious, was actually the simpler part of what needed doing. If you had wanted grouped calculations by calendar year, you'd have used t.Date.Year as a grouping variable -- no problem. So to do it with a Sep-Aug financial calendar, it was just a matter of instead binning by Sep-Aug. That sort of calendar calculation would be good in an example. Thanks for pointing that out.
But I hear what you are saying, that financial calendars ought to be more "out of the box".
To me, the more difficult part was the reorientation of the result from the year/code grouped means. unstack is not widely known, but it's similar in spirit to a pivoting operation, and people may be more familiar with that terminology, and we are trying to make stack/unstack more accessible in the doc.
dpb
dpb 2019년 4월 12일
Well, even as much "time in grade" as I have (altho I don't use ML full time nor am I actually consulting any longer, it's been 30 yr or so), the newer features of the time handling are still somewhat of a learning excercise to apply in most efficient manner--for the new user coming in from a non-programming or other background I can understand it being very daunting to come up with the "how" of binning just staring at the doc's for the first time.
While the documentation is good in most cases, a lot of the examples are just trivial to the point of not being particularly instructional; cases that do require some other manipulation that provide insight to how to extend the use would be valuable.
I didn't go to see when unstack was introduced, but I've read the splitapply and findgroups doc and all those examples many times as well as followed links form the table and retime and friends and I don't recall ever having across it before...maybe it's in that generic list of functions with table that I also can't ever find when I'm looking for it directly...

댓글을 달려면 로그인하십시오.

카테고리

Help CenterFile Exchange에서 Calendar에 대해 자세히 알아보기

태그

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by