I have the following matrix [t k p]
1.0000 1.0000 -1.1471
1.0000 2.0000 -1.0689
2.0000 1.0000 -0.8095
2.0000 2.0000 -2.9443
3.0000 1.0000 1.4384
3.0000 2.0000 0.3252
and I want an additional column with the mean of p for every t, hence
1.0000 1.0000 -1.1471 -1.1080
1.0000 2.0000 -1.0689 -1.1080
2.0000 1.0000 -0.8095 -1.8769
2.0000 2.0000 -2.9443 -1.8769
3.0000 1.0000 1.4384 0.8818
3.0000 2.0000 0.3252 0.8818
I can do it with the following code:
% Calculate the mean of p for each t
A = [t p_tk];
p_t = accumarray(A(:,1), A(:,2), [], @nanmean, NaN);
% Expand it back to long form (one value per row)
p_t_long = NaN(size(t));
for d = 1:max(t)
    ind = t == d;
    p_t_long(ind) = p_t(d);
end
However, I want to avoid loops since I have a big dataset. Can anybody help?

Accepted Answer

Stephen23 on 8 Nov 2018
Edited: Stephen23 on 8 Nov 2018

1 vote

Some indexing using the first column does what you want, more efficiently than a loop or unique:
>> M = [1,1,-1.1471;1,2,-1.0689;2,1,-0.8095;2,2,-2.9443;3,1,1.4384;3,2,0.3252]
M =
1.00000 1.00000 -1.14710
1.00000 2.00000 -1.06890
2.00000 1.00000 -0.80950
2.00000 2.00000 -2.94430
3.00000 1.00000 1.43840
3.00000 2.00000 0.32520
>> V = accumarray(M(:,1),M(:,3),[],@mean)
V =
-1.10800
-1.87690
0.88180
>> M(:,4) = V(M(:,1))
M =
1.00000 1.00000 -1.14710 -1.10800
1.00000 2.00000 -1.06890 -1.10800
2.00000 1.00000 -0.80950 -1.87690
2.00000 2.00000 -2.94430 -1.87690
3.00000 1.00000 1.43840 0.88180
3.00000 2.00000 0.32520 0.88180
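
The direct indexing V(M(:,1)) works here because the first column already holds the positive integers 1, 2, 3, which accumarray requires as subscripts. As a minimal sketch, assuming hypothetical group labels that are not valid subscripts (the values 0.5, 2 and -3 below are made up for illustration), the third output of unique can supply the indices instead:

```matlab
% Hypothetical data: same p values as above, but made-up labels in column 1
M = [0.5,1,-1.1471; 0.5,2,-1.0689; 2,1,-0.8095; 2,2,-2.9443; -3,1,1.4384; -3,2,0.3252];
[~,~,idx] = unique(M(:,1));              % idx maps each row to 1..number of distinct labels
V = accumarray(idx, M(:,3), [], @mean);  % one mean per distinct label (in sorted label order)
M(:,4) = V(idx);                         % expand back to one value per row
```

The order of V follows the sorted labels, but because the same idx is used for both steps, each row still receives the mean of its own group.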

3 Comments

Thank you for your quick answer; it works perfectly. However, I simplified my question because I thought the general case would work the same way. What if I have 3 or more groups and want to calculate the mean for each group and assign it back to every row?
M =
1.0000 1.0000 1.0000 -5.2359
1.0000 1.0000 2.0000 -12.7706
1.0000 2.0000 1.0000 8.0797
1.0000 2.0000 2.0000 5.8198
2.0000 1.0000 1.0000 3.3348
2.0000 1.0000 2.0000 -10.3368
2.0000 2.0000 1.0000 14.2749
2.0000 2.0000 2.0000 6.5018
3.0000 1.0000 1.0000 0.0093
3.0000 1.0000 2.0000 3.2289
3.0000 2.0000 1.0000 0.3800
3.0000 2.0000 2.0000 -14.5021
and I want
V =
1.0000 1.0000 1.0000 0.1435 -2.5851
1.0000 1.0000 2.0000 -5.3137 -2.5851
1.0000 2.0000 1.0000 -6.7921 -7.6780
1.0000 2.0000 2.0000 -8.5640 -7.6780
2.0000 1.0000 1.0000 -2.3356 -9.6810
2.0000 1.0000 2.0000 -17.0264 -9.6810
2.0000 2.0000 1.0000 12.6423 10.4214
2.0000 2.0000 2.0000 8.2006 10.4214
3.0000 1.0000 1.0000 2.7997 2.7260
3.0000 1.0000 2.0000 2.6523 2.7260
3.0000 2.0000 1.0000 -4.9816 4.1026
3.0000 2.0000 2.0000 13.1869 4.1026
Stephen23 on 12 Nov 2018
Edited: Stephen23 on 12 Nov 2018
>> M = [1,1,1,0.1435;1,1,2,-5.3137;1,2,1,-6.7921;1,2,2,-8.5640;2,1,1,-2.3356;2,1,2,-17.0264;2,2,1,12.6423;2,2,2,8.2006;3,1,1,2.7997;3,1,2,2.6523;3,2,1,-4.9816;3,2,2,13.1869]
M =
1.00000 1.00000 1.00000 0.14350
1.00000 1.00000 2.00000 -5.31370
1.00000 2.00000 1.00000 -6.79210
1.00000 2.00000 2.00000 -8.56400
2.00000 1.00000 1.00000 -2.33560
2.00000 1.00000 2.00000 -17.02640
2.00000 2.00000 1.00000 12.64230
2.00000 2.00000 2.00000 8.20060
3.00000 1.00000 1.00000 2.79970
3.00000 1.00000 2.00000 2.65230
3.00000 2.00000 1.00000 -4.98160
3.00000 2.00000 2.00000 13.18690
>> [~,~,idx] = unique(M(:,1:end-2),'rows'); % indices of row groups.
>> V = accumarray(idx,M(:,end),[],@mean); % mean of each group.
>> M(:,5) = V(idx)
M =
1.00000 1.00000 1.00000 0.14350 -2.58510
1.00000 1.00000 2.00000 -5.31370 -2.58510
1.00000 2.00000 1.00000 -6.79210 -7.67805
1.00000 2.00000 2.00000 -8.56400 -7.67805
2.00000 1.00000 1.00000 -2.33560 -9.68100
2.00000 1.00000 2.00000 -17.02640 -9.68100
2.00000 2.00000 1.00000 12.64230 10.42145
2.00000 2.00000 2.00000 8.20060 10.42145
3.00000 1.00000 1.00000 2.79970 2.72600
3.00000 1.00000 2.00000 2.65230 2.72600
3.00000 2.00000 1.00000 -4.98160 4.10265
3.00000 2.00000 2.00000 13.18690 4.10265
Rahel Braun on 12 Nov 2018
Thank you!


More Answers (2)

Bruno Luong on 8 Nov 2018

1 vote

A=[...
1.0000 1.0000 -1.1471
1.0000 2.0000 -1.0689
2.0000 1.0000 -0.8095
2.0000 2.0000 -2.9443
3.0000 1.0000 1.4384
3.0000 2.0000 0.3252 ]
[~,~,J] = unique(A(:,1));
p_t= accumarray(J, A(:,3), [], @(x) mean(x,'omitnan'), NaN);
[A p_t(J)]
Result
ans =
1.0000 1.0000 -1.1471 -1.1080
1.0000 2.0000 -1.0689 -1.1080
2.0000 1.0000 -0.8095 -1.8769
2.0000 2.0000 -2.9443 -1.8769
3.0000 1.0000 1.4384 0.8818
3.0000 2.0000 0.3252 0.8818

5 Comments

Bruno Luong on 8 Nov 2018
Edited: Bruno Luong on 8 Nov 2018
Good old ACCUMARRAY is still the king of speed:
n = 1e6;
ntest = 10;
time = zeros(2,ntest);
for i = 1:ntest
g = ceil(100*rand(n,1));
t = rand(n,1);
tic
grpAvg1 = splitapply(@mean,t,g);
time(1,i) = toc;
tic
grpAvg2 = accumarray(g,t)./accumarray(g,1);
time(2,i) = toc;
end
time = mean(time,2);
fprintf('splitapply time = %f s\n', time(1)); % splitapply time = 0.158369 s
fprintf('accumarray time = %f s\n', time(2)); % accumarray time = 0.029645 s
Rahel Braun on 12 Nov 2018
Thank you for your answer, and yes, accumarray is great! With a loop it took me almost a whole afternoon to calculate the means, and now it takes only a minute. However, I simplified my question because I thought the general case would work the same way. Do you know how I can solve the problem if I have 3 or more groups and want to calculate the mean for each group and assign it back to every row? Like in the comment above: https://ch.mathworks.com/matlabcentral/answers/428818-allocate-values-avoiding-loop#comment_636183
Bruno Luong on 12 Nov 2018
Edited: Bruno Luong on 12 Nov 2018
Unless I have misunderstood something, the assignment of the mean has already been answered. Look at the last statement:
[A p_t(J)]
where p_t is accumarray output (average) and J is the third output of UNIQUE.
Assign it to A if you like.
Rahel Braun on 12 Nov 2018
My problem was that I didn't know how to use unique() properly with 3 groups, and then accumarray gave me a 3-by-2 matrix that I couldn't assign. But thank you anyway.
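
The 3-by-2 matrix mentioned in this comment is what accumarray returns when it is given two subscript columns: one row per value of t and one column per value of k. As a minimal sketch, assuming t and k are positive integers as in the example data (the names V2 and pAvg are chosen here for illustration), that matrix can still be expanded back to one value per row with sub2ind:

```matlab
M = [1,1,1,0.1435;1,1,2,-5.3137;1,2,1,-6.7921;1,2,2,-8.5640; ...
     2,1,1,-2.3356;2,1,2,-17.0264;2,2,1,12.6423;2,2,2,8.2006; ...
     3,1,1,2.7997;3,1,2,2.6523;3,2,1,-4.9816;3,2,2,13.1869];
t = M(:,1); k = M(:,2); p = M(:,4);
V2 = accumarray([t k], p, [], @mean);    % 3-by-2: V2(i,j) is the mean of p where t==i and k==j
pAvg = V2(sub2ind(size(V2), t, k));      % look up each row's (t,k) cell
M(:,5) = pAvg;
```

This gives the same fifth column as the unique(...,'rows') approach, which is usually the more general choice because it does not require the grouping columns to be valid subscripts.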
"My problem was that I didn't know how to use unique() properly with 3 groups,"
Stephen already answered that: just add the 'rows' argument to get one group index (the third output) for each distinct row of the grouping columns (your "groups").
BTW, you might not have noticed that
accumarray(...,data) ./ accumarray(...,1)
is always faster than
accumarray(...,data, ..., @mean)
if speed matters to you.
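
One thing to keep in mind with the sum-divided-by-count form is that, unlike @nanmean or mean(x,'omitnan'), it does not skip NaN values: a single NaN makes the whole group sum NaN. A minimal sketch of a NaN-aware variant, assuming a hypothetical data vector p with one missing value and a group index vector g:

```matlab
g = [1;1;2;2;3;3];                                    % group index per row
p = [-1.1471; NaN; -0.8095; -2.9443; 1.4384; 0.3252]; % hypothetical data with one NaN
ok = ~isnan(p);                                       % rows with valid data
n = max(g);                                           % number of groups
grpSum = accumarray(g(ok), p(ok), [n 1]);             % sum of valid values per group
grpCnt = accumarray(g(ok), 1, [n 1]);                 % count of valid values per group
grpAvg = grpSum ./ grpCnt;                            % 0/0 gives NaN for groups with no valid data
pAvg = grpAvg(g);                                     % expand back to one value per row
```

Passing the output size [n 1] explicitly keeps both accumarray results the same length even if the last group contains only NaN values.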


Cris LaPierre on 8 Nov 2018

1 vote

Consider using the splitapply function. Assuming you have variables t, k and p already defined:
grpAvg = splitapply(@mean,p,t);
pAvg = grpAvg(t);
[t k p pAvg]

2 Comments

Cris LaPierre on 8 Nov 2018
Edited: Cris LaPierre on 8 Nov 2018
If your grouping variable is not as clean as it is in this example, you can use the findgroups function to create an index of the unique values in your grouping variable.
Using your updated matrix from a comment, here is a robust way to achieve what you want using findgroups and splitapply (assuming variables t, k, l, and p exist and represent the columns of M):
M = [t k l p]
G = findgroups(t,k);
grpAvg = splitapply(@mean,p,G);
pAvg = grpAvg(G);
V = [t k l p pAvg]
The original matrix M is
M =
1.0000 1.0000 1.0000 0.1435
1.0000 1.0000 2.0000 -5.3137
1.0000 2.0000 1.0000 -6.7921
1.0000 2.0000 2.0000 -8.5640
2.0000 1.0000 1.0000 -2.3356
2.0000 1.0000 2.0000 -17.0264
2.0000 2.0000 1.0000 12.6423
2.0000 2.0000 2.0000 8.2006
3.0000 1.0000 1.0000 2.7997
3.0000 1.0000 2.0000 2.6523
3.0000 2.0000 1.0000 -4.9816
3.0000 2.0000 2.0000 13.1869
And resulting matrix V is
V =
1.0000 1.0000 1.0000 0.1435 -2.5851
1.0000 1.0000 2.0000 -5.3137 -2.5851
1.0000 2.0000 1.0000 -6.7921 -7.6780
1.0000 2.0000 2.0000 -8.5640 -7.6780
2.0000 1.0000 1.0000 -2.3356 -9.6810
2.0000 1.0000 2.0000 -17.0264 -9.6810
2.0000 2.0000 1.0000 12.6423 10.4215
2.0000 2.0000 2.0000 8.2006 10.4215
3.0000 1.0000 1.0000 2.7997 2.7260
3.0000 1.0000 2.0000 2.6523 2.7260
3.0000 2.0000 1.0000 -4.9816 4.1026
3.0000 2.0000 2.0000 13.1869 4.1026
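
findgroups can also return the grouping values that belong to each group number, which is handy for checking the per-group means before expanding them. A minimal sketch using the same data (the names tID, kID and S are chosen here for illustration):

```matlab
M = [1,1,1,0.1435;1,1,2,-5.3137;1,2,1,-6.7921;1,2,2,-8.5640; ...
     2,1,1,-2.3356;2,1,2,-17.0264;2,2,1,12.6423;2,2,2,8.2006; ...
     3,1,1,2.7997;3,1,2,2.6523;3,2,1,-4.9816;3,2,2,13.1869];
t = M(:,1); k = M(:,2); p = M(:,4);
[G, tID, kID] = findgroups(t, k);   % G: group number per row; tID, kID: values defining each group
grpAvg = splitapply(@mean, p, G);   % one mean per (t,k) group
S = [tID kID grpAvg];               % compact table: one row per group
pAvg = grpAvg(G);                   % expanded: one value per original row
```

Here S has six rows, one per (t,k) combination, while pAvg has twelve, matching the rows of M.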



Asked: 8 Nov 2018

Commented: 12 Nov 2018
