Conditional average (need help with speed)

Question

I have a table that looks like this:

country_id   year      M       T      average_T
          2000      10      76     NaN 
          2001      5       39     Mean of 76 and 62
          2002      NaN     37     Mean of 39 =39
          2003      15      5      NaN
          2004      10      28     Mean of 5 and 2
          2005      10      8      Mean of 8=8
          1999      15      1      NaN
          2000      10      62     Mean of 1=1
          2001      20      32     Mean of 76 and 62
          2002      10      72     Mean of 32=32
          2003      15      2      Mean of 5 and 2

I want to compute the column average_T: for each row, the mean of T over all rows that share the previous row's year and M value. (The first entry for each country_id is NaN because the previous year's T is unknown for those entries.)

I have written code that does this, but it is too slow to run on my large data set:

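N = height(mytable);                 % number of rows in the table
mytable.average_T = NaN(N,1);        % first row of each country stays NaN
for k = 2:N
    % only look back when the previous row belongs to the same country
    if mytable.country_id(k) == mytable.country_id(k-1)
        % mean of T over all rows sharing the previous row's M and year
        mytable.average_T(k) = mean(mytable.T(mytable.M == mytable.M(k-1) & ...
            mytable.year == mytable.year(k-1)), 'omitnan');
    end
end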

Answer 1

dpb on 16 Jan 2021

Edited: dpb on 17 Jan 2021

Grouping variables and rowfun to the rescue...

tMeans = rowfun(@(x) mean(x,'omitnan'), mytable, 'InputVariables','T', 'GroupingVariables',{'year','M'});
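To see what that call returns, here is a minimal sketch on the sample data from the question. The country_id values 1 to 3 are an assumption, since the post does not show them, and the output name mean_T and the helper variable m2000 are made up for illustration.

```matlab
% Sample data from the question. The country_id values 1..3 are assumed,
% because the post does not show them.
country_id = [1 1 1 2 2 2 3 3 3 3 3]';
yr = [2000 2001 2002 2003 2004 2005 1999 2000 2001 2002 2003]';
M  = [10 5 NaN 15 10 10 15 10 20 10 15]';
T  = [76 39 37 5 28 8 1 62 32 72 2]';
mytable = table(country_id, yr, M, T, ...
    'VariableNames', {'country_id','year','M','T'});

% One row per (year, M) group, holding the NaN-ignoring mean of T.
tMeans = rowfun(@(x) mean(x,'omitnan'), mytable, ...
    'InputVariables', 'T', ...
    'GroupingVariables', {'year','M'}, ...
    'OutputVariableNames', 'mean_T');

% Look up one group: year 2000 with M = 10 contains T = 76 and T = 62.
m2000 = tMeans.mean_T(tMeans.year == 2000 & tMeans.M == 10);
```

The result has one row per (year, M) combination, with the grouping variables, a GroupCount variable that rowfun adds, and mean_T. Here m2000 is the mean of 76 and 62.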

Comments: 11 (9 earlier comments not shown)

dpb on 17 Jan 2021

Edited: dpb on 17 Jan 2021

No, it makes no sense. Either you compute the average over each country ID as a group as well, or you don't group by country. If you keep the country id, then it is the ID of the group. If you don't use country as a grouping variable, there is no way to associate any ordering of the elements that made up an average with the average itself; that information is gone.

As noted in the other question on the same subject, you could keep a set of which countries were included in the averaging, but that is all it is; there is no order to associate with the mean.

Or, in a similar vein to the comment on the other question, you could add an auxiliary variable holding the table row number, pass it through the function, and keep it with the group to identify the group's members. That list could be sorted, but it only identifies who is in the group; the order has no meaning for the computed mean.

BTW, this last id would just be the grouping index you could get from findgroups. It may be that the information it contains is what you're actually looking for here, but the request as phrased just doesn't make sense.

As for the previous-year part, you can simply shift the computed averages after the fact so that each row picks up the previous year's value, or create another year variable equal to the actual year + 1 and use it as the grouping variable instead.
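One way to do the after-the-fact association is sketched below. It is an illustration, not code from the thread: it reuses the question's sample data with assumed country_id values 1 to 3, and the names samePrev, prevYear, prevM, found, loc and ok are hypothetical. It computes the group means once, then looks up each row's previous (year, M) pair with ismember instead of rescanning the table inside a loop.

```matlab
% Sample data from the question; country_id values 1..3 are assumed.
country_id = [1 1 1 2 2 2 3 3 3 3 3]';
yr = [2000 2001 2002 2003 2004 2005 1999 2000 2001 2002 2003]';
M  = [10 5 NaN 15 10 10 15 10 20 10 15]';
T  = [76 39 37 5 28 8 1 62 32 72 2]';
mytable = table(country_id, yr, M, T, ...
    'VariableNames', {'country_id','year','M','T'});

% Group means of T, one row per (year, M) combination.
tMeans = rowfun(@(x) mean(x,'omitnan'), mytable, ...
    'InputVariables', 'T', ...
    'GroupingVariables', {'year','M'}, ...
    'OutputVariableNames', 'mean_T');

% Keys of the previous row, valid only when it is the same country.
samePrev = [false; mytable.country_id(2:end) == mytable.country_id(1:end-1)];
prevYear = [NaN; mytable.year(1:end-1)];
prevM    = [NaN; mytable.M(1:end-1)];

% Match each (prevYear, prevM) pair against the group keys.
% NaN never matches, so a missing previous M yields no match.
[found, loc] = ismember([prevYear prevM], [tMeans.year tMeans.M], 'rows');
ok = samePrev & found;

% Rows without a usable previous row stay NaN.
mytable.average_T = NaN(height(mytable), 1);
mytable.average_T(ok) = tMeans.mean_T(loc(ok));
```

With these assumed ids, rows 2 and 9 both receive the mean of 76 and 62, and the first row of each country stays NaN, which is the behaviour of the loop in the question. The grouping work is done once, so the cost no longer grows with the square of the table height.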

Mia Dier on 17 Jan 2021

Amazing thank you! :)

dpb on 17 Jan 2021

NB: You could do the same thing with findgroups and splitapply without building the output table from rowfun, too.
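A rough sketch of that findgroups and splitapply variant follows, working on plain vectors taken from the question's sample data. The variable names (yr, valid, groupMean, rowMean) are made up for illustration. Rows whose M is NaN get a group number of NaN from findgroups, so the sketch filters them out explicitly before calling splitapply.

```matlab
% year, M and T columns from the question's sample table.
yr = [2000 2001 2002 2003 2004 2005 1999 2000 2001 2002 2003]';
M  = [10 5 NaN 15 10 10 15 10 20 10 15]';
T  = [76 39 37 5 28 8 1 62 32 72 2]';

% G is the group number of each row; gYear and gM hold the key of each group.
% Rows with a missing grouping value get G = NaN.
[G, gYear, gM] = findgroups(yr, M);
valid = ~isnan(G);

% One NaN-ignoring mean per group, in group-number order.
groupMean = splitapply(@(x) mean(x,'omitnan'), T(valid), G(valid));

% Expand back to one value per row by indexing with the group number.
rowMean = NaN(size(T));
rowMean(valid) = groupMean(G(valid));
```

Here G is the grouping index mentioned earlier in the thread: it identifies which rows belong to each group, and indexing groupMean with it puts each group's mean back on its member rows.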
