Find contiguous values in vector that sum to specified value

조회 수: 2 (최근 30일)
Swisslog
Swisslog 2013년 1월 21일
Hi all,
I have a column vector e.g. V=[1 0 2 2 1 1]
I need to look within this vector and find a run of contiguous values which sum to a specific value (e.g. 5). If such a value can be found, I want the code to simply return the number of contiguous values it needed to add up to reach that target value.
So in the example above, I can see that I can add [1 0 2 2] to get 5, so that's 4 contiguous values summed, I can also see I can get 5 by adding [2 2 1] (3 values summed).
One way of doing this is applying a moving window which goes through the data summing the contents of the window. Every time the sum of the window equals my target value, that counts as a hit. The window then increases in size (1:length(V)).
The big problem I can see with this is computationally this is likely to be painfully slow because my length(V)=10^6. It's a million window sizes, and up to a million window shifts for each window
I wondered if anyone out there can think of a better way of achieving my aim?
  댓글 수: 1
Roger Stafford
Roger Stafford 2013년 1월 21일
I have two questions to ask which are important in choosing an efficient algorithm. 1) Are all your V values integers as in your example or is there a concern with round-off errors in checking sums? 2) Are all the values non-negative?

댓글을 달려면 로그인하십시오.

채택된 답변

Image Analyst
Image Analyst 2013년 1월 21일
편집: Image Analyst 2013년 1월 21일
You can use conv() to do the sums very efficiently. Try this:
tic
% Make a vector a million long.
v = randi(3, 1, 1000000);
for windowSize = 1 : 11;
sums = conv(v, ones(1, windowSize), 'same');
indexes = find(sums == 5);
fprintf('Number of 5s with window size = %d = %d\n', windowSize, length(indexes));
end
toc;
In the command window:
Number of 5s with window size = 1 = 0
Number of 5s with window size = 2 = 222880
Number of 5s with window size = 3 = 222507
Number of 5s with window size = 4 = 49205
Number of 5s with window size = 5 = 4065
Number of 5s with window size = 6 = 1
Number of 5s with window size = 7 = 0
Number of 5s with window size = 8 = 0
Number of 5s with window size = 9 = 0
Number of 5s with window size = 10 = 0
Number of 5s with window size = 11 = 0
Elapsed time is 0.343283 seconds.
Note, that some stretches of a window size might be the same stretch as a longer window if zeros are included. For example a window size of 2 with [2 3] and a window size of 3 with [2 3 0] and a window size of 4 with [2 3 0 0], etc. If you want to exclude longer window sizes from looking at the same stretch then you have to zero out the array where the smaller window was. Counting the lengths of the stretches is pretty easy if you have the Image Processing Toolbox. Just label and call regionprops at the end of each iteration. If you have that toolbox and can't figure it out, let me know. There may also be overlaps, for example in the sequence [2 3 2] you can get a sum of 5 at two positions, [2 3] and [3 2] so the 3 is in an overlap region.
  댓글 수: 8
Matt J
Matt J 2013년 1월 21일
편집: Matt J 2013년 1월 21일
A more efficient way of doing this
indexes = find(sums>3 & sums<5);
Output(i)=length(indexes);
is the 1-liner
Output(i) = nnz(sums>3 & sums<5);
See also my Answer, however. Repeated convolution on increasingly larger windows surely performs many redundant summations as compared to working with cumsum(V). Also, discarding so many convolution points might be less efficient than what IPDM does.
Swisslog
Swisslog 2013년 1월 21일
Thanks Matt, I will investigate. Thanks Image Analyst as well :-)

댓글을 달려면 로그인하십시오.

추가 답변 (1개)

Matt J
Matt J 2013년 1월 21일
편집: Matt J 2013년 1월 21일
Another approach might be to apply the IPDM tool
to the vector
Vc=cumsum(V(:));
Basically, every difference
Vc(i)-Vc(j)
corresponds to a sum over a particular window. The IPDM tool will compute these differences, but gives various options to restrict the search efficiently to certain component differences of interest. In your case, you might use
out = ipdm(Vc,'Subset','Maximum',...
'Limit',targetvalue+tolerance,...
'Result','Structure');
idx= (out.distance <= targetvalue-tolerance )&...
(out.rowindex > out.columnindex);
out.rowindex(idx)=[];
out.columnindex(idx)=[];
WindowWidths = out.columnindex-out.rowindex+1;

카테고리

Help CenterFile Exchange에서 Matrix Indexing에 대해 자세히 알아보기

태그

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by