Would like a script that removes repeat data

BA on 23 Sep 2022
Commented: BA on 23 Sep 2022
I'm looking to create a script that removes dates that repeat one after the other. For some reason, the program I used to collect the data sometimes sent a prompt twice on the same day, but I only wanted the prompt to be sent once. I want the repeated dates to be deleted. For example:
Dates_Wrong = ['2/4/21';'2/5/21';'2/5/21';'2/6/21';'2/7/21']
Dates_Wrong = 5×6 char array
    '2/4/21'
    '2/5/21'
    '2/5/21'
    '2/6/21'
    '2/7/21'
You can see that the 2/5/21 date repeats. I would like a script that eliminates that repeated entry.
The hard part is that you can't just call unique(x) on the entire dates column, because different subjects share the same dates. It has to be something that identifies two identical dates in sequence and removes the second one. Here is what the dates above would look like with the repeat removed:
Dates_Right = ['2/4/21';'2/5/21';'2/6/21';'2/7/21']
Dates_Right = 4×6 char array
    '2/4/21'
    '2/5/21'
    '2/6/21'
    '2/7/21'
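To make the unique problem concrete, here is a small sketch. The Subject IDs and the two-subject layout are made up for illustration (the real data is not shown); only the date strings follow the example above.

```matlab
% Hypothetical data: two subjects stacked in one column.
% Subject 1 has a back-to-back repeat of 2/5/21; subject 2 legitimately
% shares dates with subject 1.
Subject = [1; 1; 1; 2; 2];
Dates   = ['2/4/21'; '2/5/21'; '2/5/21'; '2/4/21'; '2/5/21'];

% unique on the whole column collapses every repeated date, including
% the ones that belong to a different subject.
u = unique(cellstr(Dates), 'stable');   % only 2 dates left: too aggressive

% Comparing each row only with the row directly above it flags just the
% back-to-back repeat (row 3).
sameAsPrevious = [false; all(Dates(2:end, :) == Dates(1:end-1, :), 2)];

cleaned = Dates(~sameAsPrevious, :);    % 4 rows remain
```

The row-by-row character comparison only works here because every date string has the same width; converting to datetime first is more robust when the strings vary in length.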
This is sort of what I was thinking of doing, but I'm not sure if it makes sense:
for x = 1:length(MorningPrompt.SurveyStartedDate)
    if x-1 == x % This is where I'm having trouble. I think the rest of the script is fine, but I'm not sure how to use this part to account for strings, since x isn't the actual string found within that variable
        MorningPrompt(x,:) = [];
    end
end

Accepted Answer

Steven Lord on 23 Sep 2022
Dates_Wrong = ['2/4/21';'2/5/21';'2/5/21';'2/6/21';'2/7/21']
Dates_Wrong = 5×6 char array
    '2/4/21'
    '2/5/21'
    '2/5/21'
    '2/6/21'
    '2/7/21'
dt = datetime(Dates_Wrong, 'InputFormat', 'M/d/yy')
dt = 5×1 datetime array
   04-Feb-2021
   05-Feb-2021
   05-Feb-2021
   06-Feb-2021
   07-Feb-2021
differences = diff(dt)
differences = 4×1 duration array
   24:00:00
   00:00:00
   24:00:00
   24:00:00
repeated = differences ~= 0
repeated = 4×1 logical array
   1
   0
   1
   1
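Note that differences and repeated are both one element shorter than dt, and that repeated is true where a date differs from the one before it (so, despite the name, a true marks an entry to keep). Add a true as the first element to always keep the first element (and so the first entry of each run of repeats), or as the last element to always keep the last.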
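As a sketch of that last step, assuming you want to keep the first entry of each run of identical dates: prepend a true so the mask has one element per row, then index with it. The names keep and Dates_Right are used here for illustration.

```matlab
Dates_Wrong = ['2/4/21'; '2/5/21'; '2/5/21'; '2/6/21'; '2/7/21'];
dt = datetime(Dates_Wrong, 'InputFormat', 'M/d/yy');

% diff(dt) ~= 0 is true where a date differs from the one before it.
% Prepending true always keeps row 1, so the first of each run survives.
keep = [true; diff(dt) ~= 0];

% Logical indexing on the rows removes the second '2/5/21'.
Dates_Right = Dates_Wrong(keep, :);

% To keep the last entry of each run instead, append the true:
% keep = [diff(dt) ~= 0; true];
```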
1 Comment
BA on 23 Sep 2022
Thank you! This is wonderful.
Just had a few questions.
1) Since I want the first value of each run of repeats, would I just have to set the last element of the logical array to 1?
2) For the logical indexing, this is the command I'm using. I think it works, but it's the first time I've used logical indexing, so I'm not sure.
%My code adapted using your code
Dates = Dataset.Dates;
dt = datetime(Dates, 'InputFormat', 'M/d/yy');
differences = diff(dt);
repeated = differences ~= 0
%Indexing
NonRepeats = Dataset(repeated(:,1)==1, :);
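One possible way to handle both questions, offered as a sketch and not as the accepted answer's exact method. To keep the first of each run of repeats, the extra true goes at the start of the mask, not the end. The mask then has one element per table row, so it can index the rows directly without the ==1 comparison. The Subject column below is hypothetical (the real variable names in Dataset are not shown); it is included so that a date shared by two neighbouring subjects is not mistaken for a repeat. The sketch also assumes the rows are already grouped by subject and in time order.

```matlab
% Hypothetical table standing in for Dataset; Subject is an assumed column.
Subject = [1; 1; 1; 2; 2];
Dates   = {'2/4/21'; '2/5/21'; '2/5/21'; '2/5/21'; '2/6/21'};
Dataset = table(Subject, Dates);

dt = datetime(Dataset.Dates, 'InputFormat', 'M/d/yy');

% A row is kept when its date or its subject differs from the row above.
% The leading true keeps row 1 and makes the mask height(Dataset) long.
dateChanged    = diff(dt) ~= 0;
subjectChanged = diff(Dataset.Subject) ~= 0;
keep = [true; dateChanged | subjectChanged];

% Row 3 (subject 1's second 2/5/21) is dropped; row 4 stays because it
% belongs to subject 2.
NonRepeats = Dataset(keep, :);
```

Without the leading true, the mask is one element shorter than the table and lines up with the wrong rows: each flag then refers to the row before the one it was computed for.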

Release

R2022a
