Removal of duplicate data

Question

Jørgen Sørebø Myhre on 24 March 2021

https://kr.mathworks.com/matlabcentral/answers/782348-removal-of-duplicate-data

Commented: Walter Roberson on 25 March 2021

Hello everyone!

I'm working on a project where the goal is to receive data from a flying object. The data is received by different stations along the flight path, and it needs to be plotted in MATLAB and displayed graphically. The data is stored in a .txt file, and I have managed to import it into MATLAB. The received data contains information such as the voltage, the RSSI, and the time between packets, among other things.

The problem with multiple stations is that the same data sent by the flying object will be collected at several stations. The issue I have is removing this duplicate data so that it is not plotted twice.

I have attached a picture of the .txt file. Column 2 holds the station numbers and column 3 holds the package numbers. A red line shows the separation between stations, and the duplicate data lies between the blue lines.

As I'm not very skilled in MATLAB, could someone point me in the right direction on what I can do to remove this data?

Also, would it be possible to make a plot so that the graph starts with the first station, continues with the next, and so on?

Sincerely, Jørgen

(English is not my first language, so please excuse me)

Comments: 2

Walter Roberson on 24 March 2021

I see you have packet number 54 detected by station 170 and station 187. How do you decide which of the two to keep? For example, do you want to keep the one with the highest RSSI? Or is there a time stamp, and you want to keep the one with the earliest time stamp?

Jørgen Sørebø Myhre on 24 March 2021

That's correct, the packages detected by station 187 from 37-54 are duplicates. I basically would like to remove these duplicates.

So it would look something like this:

Station 170 receives packages 1-54, station 187 receives packages 55-108, and so on for all the following stations.

The times are given as LSB and MSB, in columns 4 and 5 respectively. These are just the times between each package sent.


Answer 1

William Rose on 24 March 2021

https://kr.mathworks.com/matlabcentral/answers/782348-removal-of-duplicate-data#answer_656958

@Jørgen Sørebø Myhre

I assume your data is in an array called data() with 9 columns and many rows, and column 2 is the station number.

Sort the data by columns in the priority order 1,3,4,5,6,7,8,9,2. Because column 2 is the last sort key, rows that differ only in column 2 will be adjacent after the sort. Then compare each row to the next row. If they are identical except for column 2, delete the next row. Repeat until the next row does not match, then move on to the next row, and so on.
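
The sort-and-compare idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the small 4-column matrix and the names `stationCol`, `otherCols`, `isDup` and `deduped` are made up for the example (the real data is said to have 9 columns), and the row-by-row loop is replaced by a vectorized comparison with `diff`. Note that a row only counts as a duplicate here when every non-station column matches exactly, so values that differ between stations (RSSI, for instance) would prevent a match.

```matlab
% Hypothetical sample data: columns are [time, station, packet, value]
data = [ 1 170 53 10;
         2 170 54 11;
         2 187 54 11;   % same as the previous row except for the station
         3 187 55 12 ];

stationCol = 2;                                      % station number column
otherCols  = setdiff(1:size(data, 2), stationCol);   % every column except the station

% Sort with the station column as the lowest-priority key, so rows that
% differ only in station number end up next to each other.
sorted = sortrows(data, [otherCols, stationCol]);

% A row is a duplicate if it equals the row above it in all non-station columns.
isDup = [false; all(diff(sorted(:, otherCols), 1, 1) == 0, 2)];

% Keep only the first row of each group of matching rows.
deduped = sorted(~isDup, :);
disp(deduped)
```

With this sample, the row for packet 54 from station 187 is dropped and the one from station 170 is kept, because 170 sorts first.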

Comments: 4 (the 2 earlier comments are not shown)

Jørgen Sørebø Myhre on 25 March 2021

First of all, thanks to both of you for taking the time to help me.

I have my data saved in "Raw". Should I replace "YourData" with that?

Also, I get an error with "subset". Might this be something I need to replace as well?

Walter Roberson on 25 March 2021

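 % Key column used to decide which rows count as duplicates
 Subset = Raw(:, 2);
 % 'stable' keeps the first occurrence of each key, in the original row order
 [~, IA] = unique(Subset, 'rows', 'stable');
 selected_entries = Raw(IA,:);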

This assumes that column 2 by itself is enough to determine uniqueness.
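
As a small illustration of how the choice of key column affects the result, here is a sketch that keys on the packet number instead. It assumes the column layout described in the question (column 2 is the station, column 3 is the packet number); the 3-column matrix is a made-up stand-in for `Raw`, and `packetCol` is a name introduced only for this example.

```matlab
% Hypothetical stand-in for Raw: columns are [index, station, packet]
Raw = [ 1 170 53;
        2 170 54;
        3 187 53;   % packet 53 again, now seen by station 187
        4 187 54;   % packet 54 again
        5 187 55 ];

packetCol = 3;   % assumption taken from the question: column 3 is the packet number

% 'stable' keeps the first occurrence of each packet number in file order
[~, IA] = unique(Raw(:, packetCol), 'stable');
selected_entries = Raw(IA, :);
disp(selected_entries)
```

Here rows 1, 2 and 5 survive, so packets 53 and 54 are kept from station 170 and station 187 contributes only packet 55. Since `'stable'` preserves the row order of the file, the stations stay in the order in which they appear, which may also help with plotting one station after another. This only works as long as packet numbers are never reused for different data.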
