How can I merge similar rows in a matrix based on the values in the first three columns?

Question

Shayan Taheri, January 2, 2022

https://kr.mathworks.com/matlabcentral/answers/1621075-how-can-i-merge-similar-rows-in-a-matrix-based-on-the-first-three-columns-value

Commented: Voss, January 3, 2022

I have a very big matrix with 4 columns. The first three columns are coordinates of a point in a discrete 3D space, and the last column is the weight of that point. For example:

A = [1,1,1,0.2; 1,1,2,0.9; 1,2,1,1.2; ...]

Some of the coordinates, however, are duplicates with different weights. For example I might have:

A = [1,1,1,0.2; 1,1,1,2.3; 1,1,2,-0.3; ...]

What I want to achieve is to remove the duplicate coordinates, and use the mean of their weights as the weight for that coordinate. For example, after this operation, the last example will become:

A_new = [1,1,1,1.25; 1,1,2,-0.3; ...]

I have already written code that works:

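A_new = unique(A(:,1:3),"rows");                 % unique coordinates
A_new = [A_new zeros(size(A_new,1),1)];          % append a weight column (size(...,1) counts rows)
for i = 1:size(A_new,1)
    coord = A_new(i,1:3);
    dups = A(all(A(:,1:3)==coord,2), 4);         % weights of every row at this coordinate
    A_new(i,4) = mean(dups);
end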

But it is very slow for a large matrix (e.g., 1,000,000 rows). Can I optimize this code in any way?

Thank you in advance.

Shayan

Answer 1

Cris LaPierre, January 2, 2022

https://kr.mathworks.com/matlabcentral/answers/1621075-how-can-i-merge-similar-rows-in-a-matrix-based-on-the-first-three-columns-value#answer_866790

Use groupsummary. Group by the first 3 columns, and use 'mean' to determine the value of the fourth. I find it easier to use on tables, so I convert A to a table first.

A = [1,1,1,0.2; 1,1,1,2.3; 1,1,2,-0.3];
A = array2table(A);
B = groupsummary(A,1:3,'mean',4)
B = 2×5 table
    A1    A2    A3    GroupCount    mean_A4
    __    __    __    __________    _______

    1     1     1         2          1.25  
    1     1     2         1          -0.3  
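
If a numeric matrix is needed again, as in the question, the table can be converted back afterwards. This is only a minimal sketch of that extra step: the variable name T is introduced here for illustration, and it assumes the GroupCount column is not wanted in the result.

```matlab
% Example data from the question: the first two rows share a coordinate.
A = [1,1,1,0.2; 1,1,1,2.3; 1,1,2,-0.3];

% Group by columns 1-3 and average column 4, as in the answer above.
T = array2table(A);
B = groupsummary(T, 1:3, 'mean', 4);

% Drop the count column, then convert the table back to a numeric matrix.
B.GroupCount = [];
A_new = table2array(B);
disp(A_new);
```

A_new then has one row per unique coordinate, with the mean weight in the fourth column.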

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

Answer 2

Voss, January 2, 2022

https://kr.mathworks.com/matlabcentral/answers/1621075-how-can-i-merge-similar-rows-in-a-matrix-based-on-the-first-three-columns-value#answer_866795

Generate some random data mimicking your situation:

[X,Y,Z] = ndgrid(1:2,1:3,1:2);
A = [X(:) Y(:) Z(:) rand(numel(X),1)];
A(:,3) = 1;
disp(A);
    1.0000    1.0000    1.0000    0.5270
    2.0000    1.0000    1.0000    0.5825
    1.0000    2.0000    1.0000    0.3314
    2.0000    2.0000    1.0000    0.3005
    1.0000    3.0000    1.0000    0.4200
    2.0000    3.0000    1.0000    0.1138
    1.0000    1.0000    1.0000    0.1304
    2.0000    1.0000    1.0000    0.1349
    1.0000    2.0000    1.0000    0.4907
    2.0000    2.0000    1.0000    0.8794
    1.0000    3.0000    1.0000    0.3126
    2.0000    3.0000    1.0000    0.2244

Use a loop like yours but comparing indices:

[A_new,~,ii] = unique(A(:,1:3),'rows');
A_new = [A_new zeros(size(A_new,1),1)];
for i = 1:size(A_new,1)
    A_new(i,4) = mean(A(ii == i,4));
end
disp(A_new);
    1.0000    1.0000    1.0000    0.3287
    1.0000    2.0000    1.0000    0.4111
    1.0000    3.0000    1.0000    0.3663
    2.0000    1.0000    1.0000    0.3587
    2.0000    2.0000    1.0000    0.5899
    2.0000    3.0000    1.0000    0.1691

Or do the same thing with arrayfun():

[A_new,~,ii] = unique(A(:,1:3),'rows');
A_new(:,end+1) = arrayfun(@(i)mean(A(ii == i,4)),1:size(A_new,1));
disp(A_new);
    1.0000    1.0000    1.0000    0.3287
    1.0000    2.0000    1.0000    0.4111
    1.0000    3.0000    1.0000    0.3663
    2.0000    1.0000    1.0000    0.3587
    2.0000    2.0000    1.0000    0.5899
    2.0000    3.0000    1.0000    0.1691
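
Both versions rely on the third output of unique. The small sketch below uses the three-row example from the question to show what that index vector contains; the names C, mask, and w1 are chosen here purely for illustration.

```matlab
% Three-row example from the question: rows 1 and 2 share a coordinate.
A = [1,1,1,0.2; 1,1,1,2.3; 1,1,2,-0.3];

% C holds the unique coordinates; ii maps each row of A to its row in C.
[C,~,ii] = unique(A(:,1:3),'rows');
disp(ii.');

% Indexing C with ii reproduces the original coordinate columns.
same = isequal(C(ii,:), A(:,1:3));

% The weights for the first unique coordinate are the rows where ii == 1.
mask = (ii == 1);
w1 = mean(A(mask,4));
```

Comparing ii == i is therefore a comparison of one integer per row instead of three coordinates per row, but each comparison still touches every row of A.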

Comments: 2

Shayan Taheri, January 2, 2022

Thank you very much for your suggestion. This method was definitely cleaner than my code, though it wasn't much different in terms of speed. I ran it on an array of 454,000 rows and the processing time was 409 seconds. The other solution, based on groupsummary, took 14 seconds.

Voss, January 3, 2022

Good to know. I wasn't sure either of these ways would be much different than what you had in terms of speed.
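
One likely reason the loop stays slow is that ii == i is evaluated once per unique coordinate, so every iteration scans all rows of A. A further option, not proposed in this thread and not timed here, is to compute the per-group sums and counts in a single pass with accumarray. The following is a sketch under that assumption, with illustrative names (coords, sums, counts):

```matlab
% Example data from the question.
A = [1,1,1,0.2; 1,1,1,2.3; 1,1,2,-0.3];

% ii maps each row of A to its unique coordinate in coords.
[coords,~,ii] = unique(A(:,1:3),'rows');

% Sum the weights and count the rows per group, then divide to get the mean.
sums   = accumarray(ii, A(:,4));
counts = accumarray(ii, 1);
A_new  = [coords, sums ./ counts];
disp(A_new);
```

Whether this beats groupsummary on a given data set would need to be measured, for example with tic and toc.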
