nchoose2: save output in chunks?

phlie on 19 Sep 2016
Commented: phlie on 22 Sep 2016
Hi everyone, I have a cell array N(m,n) with mixed numeric/string data (with the first row as a header). I would like to create combinations without repetition of every row i with every other row j ≠ i. I am doing this with the user-written function nchoose2.
ind = nchoose2(1:size(N, 1)-1);
Unfortunately, my cell array is so large that generating ind causes an out-of-memory error. Can I save the output of nchoose2 (or nchoosek, which I wouldn't mind using instead) in chunks? For example, save the first 50k rows of ind, process them, delete them, and then move on to the next 50k?
2 Comments
José-Luis on 19 Sep 2016
Do you have any idea how large your total output would actually be?
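One way to get a feel for the size before running anything is to count the pairs directly: m rows give m(m-1)/2 unordered pairs of distinct rows. The sketch below assumes a hypothetical m of 100,000 data rows and assumes each pair is stored as two double indices of 8 bytes each; neither number comes from the original question.

```matlab
% Rough size estimate for the full index list of all row pairs.
m = 100000;                      % hypothetical number of data rows (not from the question)
npairs = m * (m - 1) / 2;        % unordered pairs of distinct rows
bytesperpair = 2 * 8;            % assumption: two double indices per pair
totalGB = npairs * bytesperpair / 1e9;
fprintf('%d rows -> %.0f pairs, about %.1f GB of indices\n', m, npairs, totalGB);
```

The count grows quadratically with m, so doubling the number of rows roughly quadruples the memory needed for the full list.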
Guillaume on 19 Sep 2016
For reference: nchoose2

Accepted Answer

Guillaume on 19 Sep 2016
Neither nchoosek nor nchoose2 lets you return only a portion of the output.
You can always generate the output using a loop and break out whenever you want:
function [rowcombination, nextfirstrow, nextsecondrow] = choose2row(in, maxrows, startfirstrow, startsecondrow)
%CHOOSE2ROW create every combination of 2 rows of a matrix/cell array
%The function can return a portion of the output and be called again to return the next portion.
%The function uses double loops to compute all combinations.
%Outputs:
% rowcombination: matrix/cell array where each row is the concatenation of two distinct rows of the original matrix/cell array.
% nextfirstrow:
% nextsecondrow: parameters to pass back to a subsequent call to CHOOSE2ROW to return the next portion of row combinations.
%                Both are Inf once every combination has been returned.
%Inputs:
% in: input matrix/cell array of size [m, n].
% maxrows: maximum number of rows of output rowcombination. Inf for no limit. Scalar, optional. default Inf.
% startfirstrow: outer loop start index. Scalar, optional. default 1.
% startsecondrow: inner loop start index. Scalar, optional. default startfirstrow + 1.
    if nargin < 2 || maxrows == Inf
        maxrows = Inf;
    else
        validateattributes(maxrows, {'numeric'}, {'scalar', 'positive', 'integer'}, 2);
    end
    if nargin < 3
        startfirstrow = 1;
    else
        validateattributes(startfirstrow, {'numeric'}, {'scalar', 'positive', 'integer', '<', size(in, 1)}, 3);
    end
    if nargin < 4
        startsecondrow = startfirstrow + 1;
    else
        validateattributes(startsecondrow, {'numeric'}, {'scalar', 'positive', 'integer', '<=', size(in, 1), '>', startfirstrow}, 4);
    end
    nrows = (size(in, 1) - startfirstrow + 1) * (size(in, 1) - startfirstrow) / 2 - (startsecondrow - startfirstrow - 1); %total size of output still to generate
    rowcombination = repmat(in(1, :), min(nrows, maxrows), 2); %initialise output to required size
    rowout = 1;
    for nextfirstrow = startfirstrow : size(in, 1)-1
        for nextsecondrow = startsecondrow : size(in, 1)
            rowcombination(rowout, :) = [in(nextfirstrow, :), in(nextsecondrow, :)];
            rowout = rowout + 1;
            if rowout > maxrows
                %chunk is full: work out where the next call should resume
                nextsecondrow = nextsecondrow + 1; %#ok<FXSET> exiting the loop
                if nextsecondrow > size(in, 1)
                    nextfirstrow = nextfirstrow + 1; %#ok<FXSET>
                    nextsecondrow = nextfirstrow + 1; %#ok<FXSET>
                    if nextfirstrow == size(in, 1)
                        %the chunk ended exactly on the last combination
                        nextfirstrow = Inf; %#ok<FXSET>
                        nextsecondrow = Inf; %#ok<FXSET>
                    end
                end
                return
            end
        end
        startsecondrow = nextfirstrow + 2; %inner loop start for the next outer iteration
    end
    %all combinations generated: signal that there is nothing left
    nextfirstrow = Inf;
    nextsecondrow = Inf;
end
Of course, you're trading performance for memory.
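A calling loop for the function above might look like the following. This is only a sketch: it assumes the function has been saved as choose2row.m on the MATLAB path, the small cell array N and the chunk size of 4 are made up for illustration (the question mentions 50k), and the running count stands in for whatever per-chunk processing is actually needed.

```matlab
% Sketch: walk through all row pairs of N in chunks of at most chunksize rows.
% Assumes choose2row.m from the answer above is on the MATLAB path.
N = {'id', 'name'; 1, 'a'; 2, 'b'; 3, 'c'; 4, 'd'};  % hypothetical data, header in row 1
data = N(2:end, :);        % drop the header row; needs at least 2 data rows
chunksize = 4;             % hypothetical; the question suggests 50000
firstrow = 1;              % resume position, updated by each call
secondrow = 2;
total = 0;
while isfinite(firstrow)   % choose2row returns Inf when nothing is left
    [chunk, firstrow, secondrow] = choose2row(data, chunksize, firstrow, secondrow);
    total = total + size(chunk, 1);   % placeholder for the real per-chunk processing
end
```

Because chunk is overwritten on each pass, only one chunk is held in memory at a time. If index pairs are wanted rather than concatenated rows, passing the column vector (1:size(data, 1))' in place of data should return them in the same chunked fashion.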
1 Comment
phlie on 22 Sep 2016
Thank you, Guillaume. This works very well!
