Removing Short Runs from Binary Data

Jim McIntyre on 19 Feb 2020
Edited: Image Analyst on 20 Feb 2020
I have a large string of binary data of the form:
A = [0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,0,0,0,0]
Within the data, if I have a group of 0s with an occasional 1, I want to convert that 1 to a zero. Similarly for a group of 1s with an occasional 0.
As a rule, I want to reset runs of 1s or 0s that are shorter than 3 consecutive values in length to the value of the surrounding elements.
So 0,0,0,1,0,0,0 would become 0,0,0,0,0,0,0
I'd also like to convert something like 1,1,1,0,0,1,0,1,1 to all 1s.
Any suggestions on how to do this? Thanks in advance.
  2 Comments
Jacob Wood on 19 Feb 2020
How important is speed here? Would you prefer a readable for-loop solution or a one-liner?
Guillaume on 19 Feb 2020
Jim McIntyre's comment mistakenly posted as an answer moved here:
Obviously a one-liner would be better, but a for-loop solution is probably okay.


Accepted Answer

Image Analyst on 19 Feb 2020
Edited: Image Analyst on 20 Feb 2020
If you have the Image Processing Toolbox, there are two built-in functions for this: bwareafilt() and bwareaopen(). Try them.
[EDIT]: OK, here is the code:
A = [0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,0,0,0,0]
% Case 1: Get rid of stretches of 1's shorter than 3.
A2 = bwareaopen(A, 3)
% Case 2: Get rid of stretches of 0's shorter than 3.
A = [1,1,1,0,0,1,0,1,1]
A3 = ~bwareaopen(~A, 3)
Case 3, both at once: if you want to get rid of runs of 1s shorter than 3 AND runs of 0s shorter than 3, the result depends on the order in which you do the operations. For example, what does [1,1,0,1,1,0,0,1,1] become? All 1s or all 0s?
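A quick way to see the order dependence on that example is to run both orders. This is only a sketch that reuses the same bwareaopen() calls as above (so it also needs the Image Processing Toolbox); the variable names onesFirst and zerosFirst are made up for illustration.

```matlab
% Requires the Image Processing Toolbox (bwareaopen).
A = [1,1,0,1,1,0,0,1,1];

% Order 1: remove runs of 1s shorter than 3, then fill runs of 0s shorter than 3.
onesFirst = ~bwareaopen(~bwareaopen(A, 3), 3)

% Order 2: fill runs of 0s shorter than 3, then remove runs of 1s shorter than 3.
zerosFirst = bwareaopen(~bwareaopen(~A, 3), 3)
```

For this input the two orders disagree completely: the first ends up all 0s and the second all 1s, so you have to decide which kind of run takes priority.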

More Answers (2)

Guillaume on 19 Feb 2020
The desired one-liner:
%demo data
A = [1 1 1 0 0 1 1 1 1 0 0 0 0 1 1 0 0 0 1 0 0 0]
%should result in
% [1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0]
%replace a run of one or two 0s or 1s that is surrounded by the opposite value with a run of that opposite value
double(regexprep(char(A), '(.)((??@char(1-$1)){1,2})\1', '$1${char(1-$2)}$1'))
However, note that behaviour may not be as you expect when you've got consecutive runs of 0s and 1s both less than 3 characters, as in your 2nd example [1,1,1,0,0,1,0,1,1]. Why is it the 0s that are replaced by 1s rather than the single 1 replaced by a 0?
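For comparison, here is a toolbox-free, loop-based sketch of the same idea using run lengths. It is not the regexprep method above and is not guaranteed to match it on every input: it flips every interior run shorter than minLen based on the original run lengths, so two adjacent short runs simply swap values and the ambiguity described above remains. The names minLen, starts, lens and B are assumptions chosen for illustration.

```matlab
A = [1 1 1 0 0 1 1 1 1 0 0 0 0 1 1 0 0 0 1 0 0 0];   % demo data from above
minLen = 3;                              % hypothetical parameter: shortest run to keep

starts = find([true, diff(A) ~= 0]);     % first index of every run
lens   = diff([starts, numel(A) + 1]);   % length of every run

B = A;
for k = 2:numel(starts) - 1              % interior runs only; edge runs have one neighbour
    if lens(k) < minLen
        idx = starts(k) : starts(k) + lens(k) - 1;
        B(idx) = 1 - A(idx);             % flip the short run to its neighbours' value
    end
end
B
```

On the demo data this reproduces the expected result listed above. Runs at the very start or end of A are left alone, because they are not surrounded on both sides.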
  1 Comment
Jim McIntyre on 19 Feb 2020
Good question, Guillaume.
So, perhaps, this should be a two or three pass solution:
1) Pass 1 would replace 00010 with 00000 and 11101 with 11111.
2) Pass 2 would replace 001100 with 000000 and 110011 with 111111.
3) ...
Obviously, I need to give this a bit more thought.
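The first two passes listed above could be sketched literally with replace() on a char vector, as below. This is only an illustration under assumptions: the test vector and the variable names s and B are made up, the third pass is left out because it is not specified, and as far as I know replace() does not match overlapping occurrences, so closely spaced blips may need a pass to be repeated.

```matlab
A = [0,0,0,1,0,0,0,1,1,0,0,0];          % made-up test vector
s = char(A + '0');                      % '000100011000'

% Pass 1: a single stray value after three of the opposite value.
s = replace(s, {'00010', '11101'}, {'00000', '11111'});

% Pass 2: a pair of stray values with two of the opposite value on each side.
s = replace(s, {'001100', '110011'}, {'000000', '111111'});

B = s - '0'                             % back to a numeric 0/1 vector
```

Note that these exact patterns leave the earlier example [1,1,1,0,0,1,0,1,1] unchanged, since none of them occurs in it, so the pattern set would still need more thought.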



Jacob Wood on 19 Feb 2020
I've got a silly one-liner:
A = [0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,0,0,0,0];
A_converted = replace(sprintf('%d', A),{'010','0110','01110','101','1001','10001'},{'000','0000','00000','111','1111','11111'}) - '0';
  1 Comment
Guillaume on 19 Feb 2020
Edited: Guillaume on 19 Feb 2020
You could just do char(A + '0') to construct the char vector instead of using sprintf.
This is arguably clearer than my regexprep solution. However, the regexprep expression can easily be extended to any maximum run length (simply replace the 2 in {1,2} with the desired maximum run length), whereas the replace version would get a bit unwieldy.
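One way to keep the replace() version manageable for longer runs is to build the pattern lists in a loop instead of typing them out. The sketch below assumes a hypothetical parameter maxRun (the longest run to overwrite) and a made-up test vector; it also uses char(A + '0') as suggested above.

```matlab
maxRun = 3;                                          % hypothetical: longest run to overwrite
old = cell(1, 2*maxRun);
new = cell(1, 2*maxRun);
for n = 1:maxRun
    old{n}          = ['0', repmat('1', 1, n), '0']; % n ones between zeros
    new{n}          = repmat('0', 1, n + 2);
    old{maxRun + n} = ['1', repmat('0', 1, n), '1']; % n zeros between ones
    new{maxRun + n} = repmat('1', 1, n + 2);
end

A = [0,0,0,0,1,1,1,1,1,0,0,0,0,0,1,1,0,0,0,0,0,1,0,0,0,0];   % made-up test vector
A_converted = replace(char(A + '0'), old, new) - '0'
```

With maxRun = 3 the loop generates the same six patterns as the hand-written lists above. Matches that share a boundary element can still interfere with each other, so the order-of-operations caveat from the other answers applies here too.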


Release

R2018a
