Extract and save data between two regions

jay yawson, 11 Jun 2019
Commented: jay yawson, 12 Jun 2019
Hello everyone,
I have data that I have manually labeled. However, this labeling is very time consuming, and I believe there's a way to automatically label and extract these portions of data for further processing. I would like to extract the data between the two lines without overlaps. Can anyone see a specific pattern at the highlighted points that we could use in code to extract them automatically?
Really appreciate any help. Thanks so much
Capture.PNG
  Comments: 4
Adam Danz, 12 Jun 2019
I thought you were trying to segment the black circled regions and that's what my solution aims at. The segmentation you described above I think is an easier problem.
jay yawson, 12 Jun 2019
Hello Adam, I realized when I worked with your code. Actually the major problem was how to automatically detect those "OFF" regions. Successful detection of those regions would mean I can automatically extract the data within those regions. Sorry for the misunderstanding.


Accepted Answer

Adam Danz, 11 Jun 2019
Edited: Adam Danz, 11 Jun 2019
Here's a rough idea that you can tweak so that it performs better. It involves calculating the slope of each line segment. The areas you circled have slopes much closer to 0 than the rest of the data, so I followed these general rules to segregate those areas:
  1. The slope of each segment must be "close to zero". This is defined as a slope between +/- 5 and is set by the "thr" threshold variable. However, there are lots of points that have a slope of 0 that aren't near those areas of interest and there are lots of points within those areas of interest that have a slope outside of these bounds. So more rules were needed.
  2. If there is a series of data points whose slopes are all close to zero except for a few that are intermingled, those few should be included too. So if a slope is outside of the bounds (+/- thr) but within 40 indices of a point that is in the bounds, it's included. This window is set by the variable "wndw". This took care of the data points within the area of interest that had somewhat larger slopes, but there was still the problem of single data points here and there that had slopes near 0 but were far from the areas of interest. So we need the next rule.
  3. There has to be at least 20 consecutive slopes near 0 or none of them are included. This is defined by the "minCount" variable. This took care of the occasional in-bound slopes that weren't in the area of interest.
I'll let you sift through this to figure out how it works. I recommend plotting some of the variables along the way. As you can see in the image (1/2 of your data), it's imperfect. But I haven't tweaked the 3 parameters mentioned above. You may need additional rules (or an entirely different method).
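To make rules 1 and 3 concrete before reading the full code, here is a minimal sketch on a synthetic signal. The signal, the names ramp1, flat, ramp2, blip, ramp3, runStart and runStop, and the run-length approach (differencing a zero-padded logical vector) are all assumptions made for illustration and differ from the grouping logic in the code that follows. Rule 2 (the "wndw" bridging) is deliberately left out.

```matlab
% Synthetic signal (an assumption, not the data from the question):
% steep ramp, long flat stretch, steep ramp, short flat blip, steep ramp.
ramp1 = linspace(0, 500, 50)';
flat  = 500 + repmat([0; 1], 15, 1);   % 30 points that wobble by +/- 1
ramp2 = linspace(520, 760, 25)';
blip  = 760 * ones(5, 1);              % too short to count as a region
ramp3 = linspace(770, 1000, 24)';
y = [ramp1; flat; ramp2; blip; ramp3];
x = (1:numel(y))';

% Rule 1: slope of each segment, flagged when it is within +/- thr of 0
m = diff(y) ./ diff(x);
thr = 5;
isNearZero = abs(m) < thr;

% Find runs of consecutive near-zero slopes by differencing a padded copy
d = diff([0; double(isNearZero); 0]);
runStart = find(d == 1);        % first index of each run
runStop  = find(d == -1) - 1;   % last index of each run

% Rule 3: discard runs shorter than minCount
minCount = 20;
keep = (runStop - runStart + 1) >= minCount;
runStart = runStart(keep);
runStop  = runStop(keep);
```

On this toy signal only the long flat stretch survives; the 5-point blip is flagged by rule 1 but dropped by rule 3.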
% Read in the data
ACC = readmatrix('straightLASER011.xlsx','Sheet',1); %read emg data
x = (1:length(ACC))'; %Replace this with your x data; should be column vector!!!
% Find peaks
[pks,locs] = findpeaks(ACC,'MinPeakDistance',1000); %find peaks
% Calculate slope at each segment
m = diff(ACC)./diff(x); %slope = rise over run
% Find slopes near 0
thr =5; %+/- thr from 0 is counted as "near 0"
isNearZero = abs(m)<thr;
% Find groups of slopes near 0. A group is defined as a slope that is
% either near 0 or is within 'wndw' indices from a slope near 0.
wndw = 40; %window
% Calculate the start and stop of each grouping (this may need tweaking if you run into errors)
nearZeroIdx = find(isNearZero); %indices of the near-zero slopes
nearZeroInt = diff(nearZeroIdx); %interval between nearZeros
inWindowGrp = cumsum(nearZeroInt<=wndw); %grouping by intervals within window
groupMarkers = [0;[0;diff([0;diff(inWindowGrp)]==1)]];
grpStart = nearZeroIdx(groupMarkers == 1);
grpStop = nearZeroIdx([groupMarkers(2:end);-1] == -1);
% Get rid of groups that aren't the min length
minCount = 20; % minimum number of near-0 to be considered a group.
groupTooSmall = (grpStop-grpStart) < minCount;
grpStart(groupTooSmall) = [];
grpStop(groupTooSmall) = [];
% Extract sequences
seqIdx = arrayfun(@(srt,stp) srt:stp, grpStart,grpStop,'UniformOutput', false);
% Plot it out
figure()
plot(x,ACC,'b-','LineWidth',2) %Plot data
hold on
cellfun(@(i)plot(x(i),ACC(i),'r-','LineWidth',2),seqIdx) %plot areas of interest
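Once seqIdx is available, pulling out and saving the data is mostly indexing. The following is a rough sketch under stated assumptions: the stand-in values for ACC, x and seqIdx only make the snippet run on its own, and the names segments, betweenIdx and the output file flatSegments.mat are hypothetical. It also shows one way to get the index ranges between consecutive detected regions, which is what the follow-up comment in the question asks for.

```matlab
% Stand-in values (assumptions) so the sketch runs on its own; in practice
% ACC, x and seqIdx come from the code above.
ACC = (1:100)'.^2;
x = (1:100)';
seqIdx = {10:20; 50:75};

% Collect the [x, ACC] values of each detected region into a cell array
segments = cellfun(@(i) [x(i(:)), ACC(i(:))], seqIdx, 'UniformOutput', false);

% Index ranges lying between consecutive detected regions
firstIdx = cellfun(@(i) i(1), seqIdx);
lastIdx  = cellfun(@(i) i(end), seqIdx);
betweenIdx = arrayfun(@(a,b) a:b, lastIdx(1:end-1)+1, firstIdx(2:end)-1, ...
    'UniformOutput', false);

% Save both to a MAT-file (hypothetical file name, written to tempdir)
outFile = fullfile(tempdir, 'flatSegments.mat');
save(outFile, 'segments', 'betweenIdx');
```

Each cell of segments is an N-by-2 matrix, so a single region can be plotted or processed with segments{k}(:,1) and segments{k}(:,2).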
  Comments: 1
jay yawson, 12 Jun 2019
I cannot express how much I appreciate this. You really did put so much work into this for me. Thanks so much. I'll be working on this all day tweaking it and finding the best solution to it. Thank you for this magnificent foot print. Will keep you updated.

Release: R2019a
