Find and reduce a numeric array with identical columns

John Smith on 30 Dec 2018
Edited: John Smith on 3 Jan 2019
Dear Sir/Madam,
I would like to ask you the following question:
I have a data file like this
tmp = [...
121 12 6914 0.5625
122 -48 6853 0.29688
119 48 6914 0.17188
125 -12 6853 0.078125
125 4 6853 0.4375
119 5 6832 0.20313
119 4 6832 0.039063
119 -4 6832 0.023438]
I would like to regroup (or reduce) it under the following condition:
If column 1 AND column 3 of a row are identical to column 1 AND column 3 of any other row, then reduce those rows to one new row. The new value of column 2 is the sum of the original column 2 values. Column 1 is kept the same; column 4 is not important.
So, for the data above, I expect the answer:
119 5 6832 0.20313 % 5+4-4=5
122 -48 6853 0.29688
125 -8 6853 0.4375 % -12+4=-8
121 12 6914 0.5625
119 48 6914 0.17188
Which MATLAB command should I use? I would greatly appreciate it if you could post your code and its output.
I am using MATLAB R2014a.
Thank you very much
Comments: 3
Image Analyst on 30 Dec 2018
I was wondering the same thing. Hopefully the order doesn't matter. I'm sure you could write the code afterwards in such a way that it doesn't matter.
John Smith on 30 Dec 2018
Edited: John Smith on 30 Dec 2018
In tmp (the data file), the rows were entered by hand in random order; there is no particular ordering.


Accepted Answer

Stephen23 on 30 Dec 2018
>> [~,X,Y] = unique(tmp(:,[1,3]),'rows');
>> out = tmp(X,:);
>> out(:,2) = accumarray(Y,tmp(:,2),[],@sum)
out =
119.000000 5.000000 6832.000000 0.023438
119.000000 48.000000 6914.000000 0.171880
121.000000 12.000000 6914.000000 0.562500
122.000000 -48.000000 6853.000000 0.296880
125.000000 -8.000000 6853.000000 0.437500
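For readers following along, here is a commented sketch of the same three lines. The names `firstRow` and `grp` are illustrative stand-ins for `X` and `Y`. The `@sum` handle is left out because `accumarray` sums by default. Note that column 4 is simply copied from whichever representative row `unique` returns for each group, and that choice can differ between MATLAB releases and options.

```matlab
tmp = [ ...
    121  12 6914 0.5625
    122 -48 6853 0.29688
    119  48 6914 0.17188
    125 -12 6853 0.078125
    125   4 6853 0.4375
    119   5 6832 0.20313
    119   4 6832 0.039063
    119  -4 6832 0.023438];

% Group the rows by the (column 1, column 3) pair.
% grp(k) is the group number of row k of tmp;
% firstRow(g) is the index of one representative row for group g.
[~, firstRow, grp] = unique(tmp(:,[1,3]), 'rows');

% Start from one representative row per group ...
out = tmp(firstRow, :);

% ... then overwrite column 2 with the per-group sum of column 2.
out(:,2) = accumarray(grp, tmp(:,2));
```

The groups come out sorted by the key columns, which is why the result is ordered by column 1 and then column 3.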
Comments: 7
Stephen23 on 1 Jan 2019
Edited: Stephen23 on 1 Jan 2019
Replace the line with accumarray with these three lines:
S = max([X1,X3]);
C = cell(S);
C(sub2ind(S,X1,X3)) = baz;
and a Happy New Year!
John Smith on 2 Jan 2019
Edited: John Smith on 3 Jan 2019
Dear Stephen,
After changing those three lines, your code works.
Going back to my original question, I said "Column 4 is not important", but now I need to treat column 4 the same way as column 2:
tmp =
121 12 6914 0.5625
122 -48 6853 0.29688
119 48 6914 0.17188
125 -12 6853 0.078125
125 4 6853 0.4375
119 5 6832 0.20313
119 4 6832 0.039063
119 -4 6832 0.023438
out =
119 5 6832 0.265631 (%=0.20313+0.039063+0.023438)
119 48 6914 0.17188
121 12 6914 0.5625
122 -48 6853 0.29688
125 -8 6853 0.515625 (%=0.078125+0.4375)
How would you change your three-line code?
[~,X,Y] = unique(tmp(:,[1,3]),'rows');
out = tmp(X,:);
out(:,2) = accumarray(Y,tmp(:,2),[],@sum)
I tried to modify it the following way, but it did not work:
out(:,2) = accumarray(Y,tmp(:,2), tmp(:,4), [],@sum)
However, when I call accumarray twice, once per column, it works:
out(:,2) = accumarray(Y,tmp(:,2),[],@sum)
out(:,4) = accumarray(Y,tmp(:,4),[],@sum)
Thank you very much.
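A note on why the single-call attempt fails: the third positional argument of `accumarray` is the output size, so `tmp(:,4)` is read as a size rather than as data, and the values argument has to be a vector. A minimal sketch that generalizes the two-line version is to loop over the columns to be summed. The loop variable `c` and the names `firstRow` and `grp` are illustrative.

```matlab
tmp = [ ...
    121  12 6914 0.5625
    122 -48 6853 0.29688
    119  48 6914 0.17188
    125 -12 6853 0.078125
    125   4 6853 0.4375
    119   5 6832 0.20313
    119   4 6832 0.039063
    119  -4 6832 0.023438];

% Group rows by the (column 1, column 3) pair, as before.
[~, firstRow, grp] = unique(tmp(:,[1,3]), 'rows');
out = tmp(firstRow, :);

% accumarray takes one vector of values per call,
% so sum each requested column separately.
for c = [2 4]
    out(:,c) = accumarray(grp, tmp(:,c));
end
```

To sum more columns, only the list in the `for` statement needs to change.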


More Answers (1)

Image Analyst on 30 Dec 2018
Edited: Image Analyst on 30 Dec 2018
What about using grpstats(), if you have the Statistics and Machine Learning Toolbox?
tmp = [...
121 12 6914 0.5625
122 -48 6853 0.29688
119 48 6914 0.17188
125 -12 6853 0.078125
125 4 6853 0.4375
119 5 6832 0.20313
119 4 6832 0.039063
119 -4 6832 0.023438]
col5 = 10000*tmp(:, 1) + tmp(:, 3)
tmp = [tmp, col5];
% No sum in grpstats, so have to do it twice.
% Once to get the mean and once to get the count.
outputMean = grpstats(tmp, tmp(:, 5), 'mean')
outputNumel = grpstats(tmp, tmp(:, 5), 'numel')
% Crop off temporary 5th column
output = outputMean(:, 1:4) % Initialize
% Column 2 is the sum = mean * count
output(:, 2) = outputMean(:, 2) .* outputNumel(:, 2)
The output seems to be sorted by the first column though:
output =
119 5 6832 0.088544
119 48 6914 0.17188
121 12 6914 0.5625
122 -48 6853 0.29688
125 -8 6853 0.25781
That might be a problem for you. I'm not sure. Of course column 4 can be cropped off or ignored since you say it's not important.
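If the sorted order does turn out to be a problem, one possible alternative that needs no toolbox is `unique` with the `'stable'` option, which keeps groups in order of first appearance. This is only a sketch: it assumes a release where `'stable'` is available, and the names `firstRow` and `grp` are illustrative. Grouping on the two columns directly with `'rows'` also avoids the temporary key `10000*tmp(:,1) + tmp(:,3)`, which relies on column 3 staying below 10000.

```matlab
tmp = [ ...
    121  12 6914 0.5625
    122 -48 6853 0.29688
    119  48 6914 0.17188
    125 -12 6853 0.078125
    125   4 6853 0.4375
    119   5 6832 0.20313
    119   4 6832 0.039063
    119  -4 6832 0.023438];

% 'stable' numbers the groups in order of first appearance in tmp
% instead of sorting them by key.
[~, firstRow, grp] = unique(tmp(:,[1,3]), 'rows', 'stable');

% One row per group, taken from the first occurrence of each key.
output = tmp(firstRow, :);

% Column 2 becomes the per-group sum (accumarray sums by default).
output(:,2) = accumarray(grp, tmp(:,2));
```

With the data above, the groups then appear in the order 121/6914, 122/6853, 119/6914, 125/6853, 119/6832.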
