# Finding Sequences of 1's values

조회 수: 33 (최근 30일)
James 2011년 9월 21일
답변: Jan 2022년 6월 30일
I have a matrix of values, for example: [0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 1 0 0 ...]
In practice this is a vector (1 row by somewhere around 80000 columns). I need to output a vector with the duration of these 'bursts' of 1's. For example, the output for the above would be: [3 2 5 ...]
Background: I have a time series where all the 1's are values above a threshold and all the 0's are values below a threshold. So by doing this I'm trying to investigate the duration of these extreme events.
Can anyone tell me how I can do this?

댓글을 달려면 로그인하십시오.

### 채택된 답변

Image Analyst 2011년 9월 21일
If you have the Image Processing Toolbox, just call bwconncomp() or bwlabel(), then call regionprops() asking for 'Area' and then extract the areas from the structure. Three lines.
labeledMatrix = bwlabel(data);
measurements = regionprops(labeledMatrix, 'Area');
allAreas = [measurements.Area];
##### 댓글 수: 1이전 댓글 -1개 표시이전 댓글 -1개 숨기기
Jan 2011년 9월 22일
data = round(rand(1, 80000))
regionprops (Image Analyst): 0.55 sec
Vectorized FINDSTR(DIFF): 0.0081 sec
Cleaned loop (Derek + Jan): 0.0048 sec
(Measured under Matlab 2009a, WinXP, 2.3GHz Core2Duo)

댓글을 달려면 로그인하십시오.

### 추가 답변 (8개)

Derek O'Connor 2011년 9월 21일
If you don't have any toolboxes then this plain Matlab function may help. It is based loosely on the WordCount procedure in Kernighan & Plauger [1981], Software Tools in Pascal, ©Bell Labs, Addison-Wesley, page 17.
For n = 10^8 it takes about 4 secs (single core) 2.3GHz, R2008a 64bit on Windows 7.
function [start, len, k1] = ZeroOnesCount(v)
%
% Determine the position and length of each 1-string in v,
% where v is a vector of 1s and 0s
% Derek O'Connor 21 Sep 2011
%
n = length(v);
start = zeros(1,n); % where each 1-string starts
len = zeros(1,n); % length of each 1-string
k1= 0; % count of 1-strings
inOnes = false;
for k = 1:n
if v(k) == 0 % not in 1-string
inOnes = false;
elseif ~inOnes % start of new 1-string
inOnes = true;
k1 = k1+1;
start(k1) = k;
len(k1) = 1;
else % still in 1-string
len(k1) = len(k1)+1;
end
end
%------------- ZeroOnesCount ----------------------------
>> v = [0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 1 0 0];
>> [start,len,k1] = ZeroOnesCount(v);disp([start(1:k1); len(1:k1)]);
7 15 21
3 2 5
##### 댓글 수: 1이전 댓글 -1개 표시이전 댓글 -1개 숨기기
Jan 2011년 9월 22일
To get the answer needed by the OP, collecting "start" is not needed, but occupies n*8 bytes in the memory.
But the idea to create a loop is efficient. +1

댓글을 달려면 로그인하십시오.

James 2011년 9월 21일
Thanks, that worked perfect!
##### 댓글 수: 0이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

Derek O'Connor 2011년 9월 22일
This is a more general and simpler solution than my previous answer.
%-------------------------------------------------------
function [start, val, len, rn] = RunsCount(v)
%
% Determine the length of each val-run in v,
% where v is a vector of 'values'
% Example 1: RunsCount([2 2 2 1 1 9 8 8 3 3 3 3]) gives
% start 1 4 6 7 9
% val 2 1 9 8 3
% len 3 2 1 2 4
% Example 2: RunsCount(['aaaabbaacccbbb']);
% start 1 5 7 9 12
% val 97 98 97 99 98
% len 4 2 2 3 3
%
% Derek O'Connor 22 Sep 2011
%
n = length(v);
val = zeros(1,n); % run value
len = zeros(1,n); % run length
start = zeros(1,n); % pos. in v where run starts
start(1) = 1;
rk = 1; % number in run
rn = 1; % number of runs
for k = 2:n
if v(k) == v(k-1) % in run
rk = rk+1;
else % end of run
val(rn) = v(k-1);
len(rn) = rk;
rk = 1; % v(k) is start of
rn = rn+1; % next run
start(rn) = k; % position of start in v
end
end
val(rn) = v(n); % last run
len(rn) = n - sum(len);
%
%------------- End RunsCount ----------------------------
EDIT : Pre-allocated all output arrays. Changed loop counter from i to k.
##### 댓글 수: 2없음 표시없음 숨기기
Jan 2011년 9월 22일
Without pre-allocating "val", "len" and "start", this code will be very slow for a large input due to the huge memory footprint. Therefore I'd prefer your other solution.
Callula Killingly 2022년 6월 30일
Is there a way to modify this so that a particular sequence is identified? i.e., rather than consecutive numbers, can this function be adapted to search for the sequence [1 0] ?

댓글을 달려면 로그인하십시오.

Jan 2011년 9월 22일
A modified version of Derek's solution:
function len = CountOnes(v)
n = length(v);
len = zeros(1,n); % length of each 1-string
k1 = 0; % count of 1-strings
inOnes = false;
s = 0;
for k = 1:n
if v(k) % not in 1-string
s = s + 1; % increase the counter
inOnes = true;
elseif inOnes % leave the ones block
k1 = k1 + 1;
len(k1) = s;
s = 0; % reset the counter
inOnes = false;
end
end
len = len(1:k1);
And this is a vectorized approach:
function len = CountOnes_vectorized(v)
v = diff([0, x, 0]);
len = findstr(v, -1) - findstr(v, 1);
It looks nice, but is 45% slower than the loop method for the 80'000 elements needed by the OP.
[EDITED] New version with a nested loop:
function len = CountOnes2(v)
n = length(v);
len = zeros(1, ceil(n/2), 'uint32');
j = 0;
k = 1;
while k <= n
if v(k)
a = k;
k = k + 1;
while k <= n && v(k)
k = k + 1;
end
j = j + 1;
len(j) = k - a;
end
k = k + 1;
end
len = len(1:j);
##### 댓글 수: 5이전 댓글 3개 표시이전 댓글 3개 숨기기
Saida Nawrin 2022년 1월 14일
is there any way that the same can be done for a matrix? I have the similar situation but instead of a vector I have 864000 x 814 diamension matrix containing only 0 and 1. I just want to add 1's that are continuous like this problem. but the format is in matrix. Probably its an easy solution. sorry I am kind of new to matlab.
Image Analyst 2022년 1월 14일
My solution above (the top, accepted one) also works for matrices. If it doesn't read this
and post your image and question in a new question.

댓글을 달려면 로그인하십시오.

Derek O'Connor 2011년 9월 22일
Jan,
I'm using the Add-an-Answer window because the comment window is hard to use for all but short, text-only replies.
I posted an older version of RunsCount (no pre-allocation) by mistake. At that time I wasn't sure how long the output arrays should be. The size varies with the input but the worst is n, so pre-allocate the output arrays of length n.
I take your point that the array start, etc., are not necessary, but I suspect this information might be needed in other applications of these counters. Anyway, to compare your modification with ZeroOnesCount and RunsCount, I stripped out all but the output array len(1,n).
RC3 is RunsCount with start and val array and their ops stripped out
ZOC2 is ZeroOnesCount with start array and its ops stripped out.
COJ is CountOnes, Jan's modification of ZOCorig
The table below are the normalized average times, averaged over 100 runs The final row gives the actual average times for n = 10^8.
Normalized Average Times
n RC3 ZOCorig ZOC2 COJ
-----------------------------------------------
10^3 1.53 3.84 1 1.16
10^4 1.26 1.49 1 1.11
10^5 1.19 1.24 1 1.15
10^6 1.15 1.25 1 1.31
10^7 1.15 1.24 1 1.32
10^8 1.14 1.26 1 1.32
-----------------------------------------------
Actual(secs)10^8 3.84 4.23 3.36 4.44
I was surprised by this result: ZOC2 is uniformly better than COJ.
I profiled both functions with n = 10^8 and compared statement counts (timing info is ignored because it is useless for comparisons).
COJ ZOC2
1 10 for k = 1:n 1 11 for k = 1:n
100000000 11 if v(k) 100000000 12 if v(k) == 0
49999019 12 s = s + 1; 50000981 13 inOnes = false;
49999019 13 inOnes = true; 49999019 14 elseif ~inOnes
50000981 14 elseif inOnes 25002599 15 inOnes = true;
25002599 15 k1 = k1 + 1; 25002599 16 k1 = k1+1;
25002599 16 len(k1) = s; 25002599 17 len(k1) = 1;
25002599 17 s = 0; 24996420 18 else
25002599 18 inOnes = false; 24996420 19 len(k1) = len(k1)+1;
25002599 19 end 24996420 20 end
100000000 20 end 100000000 21 end
At first glance there seems to be very little difference in the counts: if-elseif about the same, inOnes = true-false, about the same.
The only difference I can see is that COJ does about 75x10^6 ops on the counter s (lines 12,17) which are not done by ZOC2. However, ZOC2 does about 25x10^6 ops extra on len (line 19) which are not done by COJ.
Thus COJ is doing about 50x10^6 extra assignment operations. This would seem to account for the difference between the two functions.
##### 댓글 수: 4이전 댓글 2개 표시이전 댓글 2개 숨기기
Walter Roberson 2011년 9월 23일
uint4 and uint1 are not defined by MATLAB.
Jan 2011년 9월 23일
편집: Jan 2022년 6월 30일
@Walter: I think Derek bantered me with the UINT1 because he thought, that I had changed the topic during the process of measurement.

댓글을 달려면 로그인하십시오.

Jos (10584) 2014년 3월 18일
Another idea, not tested for speed:
A = [0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 1 0 0]
B = cumsum(a)
tf = B(2:end) == B(1:end-1) & A(1:end-1)==1
result = diff([0 B(tf)])
##### 댓글 수: 0이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

Jan 2018년 1월 8일
편집: Jan 2018년 1월 8일
[B, N] = RunLength(x)
Result = N(B == 1);
##### 댓글 수: 0이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

Jan 2022년 6월 30일
Some timings of the methods provided in this thread:
x = randi([0, 1], 1, 80000);
format long g
disp(timeit(@() Jos(x)))
0.00079983
disp(timeit(@() CountOnes2(x)))
0.00087183
disp(timeit(@() CountOnes_vectorized(x)))
0.00074883
disp(timeit(@() CountOnes(x)))
0.00129683
disp(timeit(@() ZeroOnesCount(x)))
0.00138283
disp(timeit(@() RunsCount(x)))
0.00078483
disp(timeit(@() ImgProcMethod(x)))
0.05535983
function r = Jos(x)
B = cumsum(x);
tf = B(2:end) == B(1:end-1) & x(1:end-1);
r = diff([0 B(tf)]);
end
function len = CountOnes2(v)
n = length(v);
len = zeros(1, ceil(n/2), 'uint32');
j = 0;
k = 1;
while k <= n
if v(k)
a = k;
k = k + 1;
while k <= n && v(k)
k = k + 1;
end
j = j + 1;
len(j) = k - a;
end
k = k + 1;
end
len = len(1:j);
end
function len = CountOnes_vectorized(x)
v = diff([0, x, 0]);
len = findstr(v, -1) - findstr(v, 1);
end
function len = CountOnes(v)
n = length(v);
len = zeros(1,n); % length of each 1-string
k1 = 0; % count of 1-strings
inOnes = false;
s = 0;
for k = 1:n
if v(k) % not in 1-string
s = s + 1; % increase the counter
inOnes = true;
elseif inOnes % leave the ones block
k1 = k1 + 1;
len(k1) = s;
s = 0; % reset the counter
inOnes = false;
end
end
len = len(1:k1);
end
function len = RunsCount(v)
n = length(v);
len = zeros(1,n); % run length
rk = 1; % number in run
rn = 1; % number of runs
for k = 2:n
if v(k) == v(k-1) % in run
rk = rk+1;
else % end of run
len(rn) = rk;
rk = 1; % v(k) is start of
rn = rn+1; % next run
end
end
len(rn) = n - sum(len);
end
function len = ZeroOnesCount(v)
n = length(v);
len = zeros(1,n); % length of each 1-string
k1= 0; % count of 1-strings
inOnes = false;
for k = 1:n
if v(k) == 0 % not in 1-string
inOnes = false;
elseif ~inOnes % start of new 1-string
inOnes = true;
k1 = k1+1;
len(k1) = 1;
else % still in 1-string
len(k1) = len(k1)+1;
end
end
end
function allAreas = ImgProcMethod(data)
labeledMatrix = bwlabel(data);
measurements = regionprops(labeledMatrix, 'Area');
allAreas = [measurements.Area];
end

댓글을 달려면 로그인하십시오.

### 카테고리

Help CenterFile Exchange에서 Logical에 대해 자세히 알아보기

### Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by