Insert efficiently elements into sorted array

조회 수: 25 (최근 30일)
Manuel Barrio
Manuel Barrio 2018년 3월 8일
댓글: Haneesha Kommalapati 2021년 1월 31일
I want to insert elements into a sorted array -one per column- so that the result is also sorted. The following example is very simple but the matrices I am working with are quite big and hence my question is how to do it efficiently rather that how to do it.
A=[1 3 5 7; 2 3 6 7; 3 5 8 9]'
v=[6 1 6] --elements to be inserted
so that the result should be
B=[1 3 5 6 7; 1 2 3 6 7; 3 5 6 8 9]'
My first option was
B=sort([A; v])
but this takes very long with big matrices. Any optimization on this?
  댓글 수: 1
Jan
Jan 2018년 3월 8일
편집: Jan 2018년 3월 8일
As usual for optimizing code: The real dimensions matter. It is important, if you are working with [1e3 x 1e6] or [1e6 x 1e3] matrices.

댓글을 달려면 로그인하십시오.

채택된 답변

James Tursa
James Tursa 2018년 3월 8일
편집: James Tursa 2018년 3월 8일
Here is a mex routine to do the operation in case you are interested. It runs in about 1/2 the time of the m-code on my machine. (Sorry Jan ... I couldn't resist!)
/* --------------------------------------------------------------------
File: sorted_insert.c
sorted_insert inserts vector elements into a sorted matrix by columns
to result in a sorted matrix by columns. Syntax:
C = sorted_insert(A,B)
A = full real double 2D matrix that is sorted by columns (MxN)
B = full real double 1D vector that has same number of elements
as A has columns (1xN) or (Nx1)
C = Matrix A with B elements inserted in sorted fashion. I.e., if B is
a row vector then this is the equivalent of C = sort([A;B],1)
Programmer: James Tursa
Date: March 8, 2018
*/
/* Includes ----------------------------------------------------------- */
#include "mex.h"
/* Prototype in case this is earlier than R2015a */
mxArray *mxCreateUninitNumericMatrix(size_t m, size_t n,
mxClassID classid, mxComplexity ComplexFlag);
/* Gateway ------------------------------------------------------------ */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
mwSize i, i1, i2, i3, j, m, n;
double *Apr, *Bpr, *Cpr;
if( nlhs > 1 ) {
mexErrMsgTxt("Too many outputs");
}
if( nrhs != 2 || !mxIsDouble(prhs[0]) || !mxIsDouble(prhs[1]) ||
mxIsComplex(prhs[0]) || mxIsSparse(prhs[0]) ||
mxIsComplex(prhs[1]) || mxIsSparse(prhs[1]) ) {
mexErrMsgTxt("Need two full real double inputs");
}
if( mxGetNumberOfDimensions(prhs[0]) != 2 ) {
mexErrMsgTxt("First input must be a 2D matrix");
}
m = mxGetM(prhs[0]);
n = mxGetN(prhs[0]);
if( (mxGetM(prhs[1]) != 1 && mxGetN(prhs[1]) != 1) ||
mxGetNumberOfElements(prhs[1]) != n ) {
mexErrMsgTxt("Second input must be vector with same number of elements as columns of first input");
}
plhs[0] = mxCreateUninitNumericMatrix(m+1,n,mxDOUBLE_CLASS,mxREAL);
Apr = mxGetPr(prhs[0]);
Bpr = mxGetPr(prhs[1]);
Cpr = mxGetPr(plhs[0]);
for( j=0; j<n; j++ ) { /* For each column */
i1 = 0;
i3 = m;
while( i3-i1 > 1 ) { /* Binary search to find insert spot */
i2 = (i3 + i1) >> 1;
if( Apr[i2] > *Bpr ) {
i3 = i2;
} else {
i1 = i2;
}
}
if( i1 < m && Apr[i1] > *Bpr ) { /* Potential 1st spot adjustment */
i3--;
}
for( i=0; i<i3; i++ ) { /* Copy the stuff before */
*Cpr++ = *Apr++;
}
*Cpr++ = *Bpr++; /* Copy the vector element */
for( i=i3; i<m; i++ ) { /* Copy the stuff after */
*Cpr++ = *Apr++;
}
}
}
  댓글 수: 11
Manuel Barrio
Manuel Barrio 2018년 3월 12일
Thank you very much James. Hopefully this will save me a great amount of computation. Cheers,
Haneesha Kommalapati
Haneesha Kommalapati 2021년 1월 31일
hey hi need answer for modify INSERTION_SORT.m in mtalab to sort an input array A using binary search instead of linear search and compare the time complexities of both insertion sort Algorithms. please help these...

댓글을 달려면 로그인하십시오.

추가 답변 (3개)

Jan
Jan 2018년 3월 8일
편집: Jan 2018년 3월 8일
Actually a binary search in the subvectors is optimal. But in my experiments find was not slower even if the binary search was performed in a C-mex function.
A = [1 3 5 7; 2 3 6 7; 3 5 8 9]';
v = [6 1 6];
sA = size(A);
B = zeros(sA(1) + 1, sA(2));
for k = 1:sA(2)
index = find(A(:, k) > v(k), 1);
if isempty(index)
B(1:sA, k) = A(:, k);
B(sA+1, k) = v(k);
else
B(1:index - 1, k) = A(1:index - 1, k);
B(index, k) = v(k);
B(index + 1:sA+1, k) = A(index:sA, k);
end
end
Maybe FEX: insertrows solves the problem efficiently, but it works along the rows. Working along columns is faster.
Please post some timings with relevant input data.
  댓글 수: 3
Jan
Jan 2018년 3월 8일
편집: Jan 2018년 3월 8일
Do you need it faster? Then I could try a C-mex function. But actually I do not see a bottleneck caused by Matlab here, except that find(A(:,k)>a(w,k),1) might create a copy of A(:, k) without need.
The time for sorting grows super-linear with the size of the data, but find has a linear complexity only. Therefore the advantage of find might be growing with the data size. Is 1e5 x 1e2 a typical size?
Manuel Barrio
Manuel Barrio 2018년 3월 9일
I need it as fast as possible since my goal is to increase sA2 as much as possible --minimum 1e3 and ideally 1e4, and so far I cannot even reach 1e3. sA1 should always be in the range of 1e5

댓글을 달려면 로그인하십시오.


Manuel Barrio
Manuel Barrio 2018년 3월 8일
편집: Manuel Barrio 2018년 3월 8일
Additionally, if you wanted the result to be in A and decided to manipulate A directly with the following code
for k=1:sA2
index=find(A(:,k)>a(w,k),1);
if isempty(index)
A(sA1+1, k) = a(w,k);
else
A(index:sA1+1,k)=[a(w,k); A(index:sA1,k)];
end
end
Then performance drops considerably: Elapsed time is 15.218897 seconds.
  댓글 수: 2
Jan
Jan 2018년 3월 8일
Try to replace
A(index:sA1+1,k)=[a(w,k); A(index:sA1,k)];
by
A(index+1:sA1+1,k) = A(index:sA1,k);
A(index) = a(w,k);
With the first [a(w,k); A(index:sA1,k)] must be created explicitly, while it could be possible, that shifting the values in the 2nd method is done inplace.
Manuel Barrio
Manuel Barrio 2018년 3월 9일
It doesn't work. Same as before. Thanks anyhow for the suggestion

댓글을 달려면 로그인하십시오.


Bruno Luong
Bruno Luong 2018년 3월 12일
The best way to insert/delete into a sorted array is ... not using array at all, but a more sophisticated data structure, e.g. red-black tree.
Unfortunately MATLAB does not provide any efficient support for list/tree. One need to use C/C++ for efficient implementation.

카테고리

Help CenterFile Exchange에서 Shifting and Sorting Matrices에 대해 자세히 알아보기

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by