Delete only consecutive repeated string entries from a dataset in matlab

Question

avantika, 29 Aug 2013

Direct link to this question:

https://kr.mathworks.com/matlabcentral/answers/85896-delete-only-consecutive-repeated-string-entries-from-a-dataset-in-matlab

Hi!

I am relatively new to MATLAB. I have a dataset with three columns: time, pitch and notation. For example:

 time   pitch notation
5725 329.63 G
5800 329.63 G
5900 311.13 M
5900 311.13 M
6000 570.40 P

I want to remove duplicates that occur consecutively in the file, while keeping the order of the remaining rows. The output should be:

 time   pitch notation
5725 329.63 G
5900 311.13 M
6000 570.40 P

I am currently using MATLAB 7.9.0, so the 'first' option of the unique command is not supported. Can anyone tell me how to proceed?

[EDITED, table formatted, Jan]

Comments: 5 (3 older comments not shown)

avantika, 29 Aug 2013 (edited 29 Aug 2013)

Hi!

I am relatively new to MATLAB. I have a dataset with three columns: time, pitch and notation. For example:

 time   pitch notation
5725 329.63 GM
5800 329.63 GM
5900 311.13 MM
5900 311.13 MM
6000 570.40 PM
6725 329.63 GM
6800 329.63 GM
7900 311.13 MM
8900 311.13 MM
9000 570.40 PM
9500 570.40 PM
1000 570.40 PM

I want to remove repeated entries that occur consecutively in the pitch and notation columns, but the order of the dataset should remain the same. The output should be:

 time   pitch notation
5725 329.63 GM
5900 311.13 MM
6000 570.40 PM
6725 329.63 GM
7900 311.13 MM
9000 570.40 PM

Jan, 30 Aug 2013 (edited 30 Aug 2013)

I still do not understand the empty lines, and the type of the input is not explained here. Is this a text file, a cell containing strings, or is pitch a field of a struct that contains a double vector?

What should happen for:

5725 329.63 GM
5800 329.63 GM
5725 329.63 GM
5800 329.63 GM

So does "consecutively" refer to the position in the array, or should a line vanish whenever an equal set of values appears anywhere before it, even if other lines lie in between?

I assume that the solution is very easy, if you define the wanted procedure and the class of the input exactly.
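To make the two readings concrete, here is a minimal sketch that applies both to the four rows above. It assumes the data is held in a plain numeric matrix (time and pitch only, since the notation is 'GM' in every row); the variable names are illustrative and not part of the original thread.

```matlab
% Jan's four rows as [time pitch]; notation is 'GM' throughout.
rows = [5725 329.63
        5800 329.63
        5725 329.63
        5800 329.63];

% Reading 1: drop a row only if it equals the row directly above it.
keepAdjacent = [true; any(diff(rows, 1, 1) ~= 0, 2)];

% Reading 2: drop a row if an equal row occurs anywhere earlier.
[~, firstIdx] = unique(rows, 'rows', 'first');
keepGlobal = false(size(rows, 1), 1);
keepGlobal(firstIdx) = true;

% keepAdjacent is true for all four rows; keepGlobal is true for rows 1 and 2 only.
```

Under the first reading nothing is removed here, because no row repeats its immediate predecessor; under the second reading the last two rows disappear.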


Answer 1

Andrei Bobrov, 29 Aug 2013 (edited 30 Aug 2013)

Direct link to this answer:

https://kr.mathworks.com/matlabcentral/answers/85896-delete-only-consecutive-repeated-string-entries-from-a-dataset-in-matlab#answer_95475

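    % Sample data as a cell array; the first row holds the variable names.
    C = {'time' 'pitch' 'notation'
         5725 329.63 'GM'
         5800 329.63 'GM'
         5900 311.13 'MM'
         6000 570.40 'PM'
         6725 329.63 'GM'
         6800 329.63 'GM'
         7900 311.13 'MM'
         8900 311.13 'MM'
         9000 570.40 'PM'
         9500 570.40 'PM'
         1000 570.40 'PM'};
    d = cell2dataset(C);             % your dataset array
    [~,~,ii] = unique(d.notation);   % ii: group index of each row's notation
    out = d([true;diff(ii)~=0],:);   % keep a row only if its notation differs from the row above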

ADD without dataset array

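    % Same data, processed directly as a cell array (row 1 is the header).
    C = {'time' 'pitch' 'notation'
         5725 329.63 'GM'
         5800 329.63 'GM'
         5900 311.13 'MM'
         6000 570.40 'PM'
         6725 329.63 'GM'
         6800 329.63 'GM'
         7900 311.13 'MM'
         8900 311.13 'MM'
         9000 570.40 'PM'
         9500 570.40 'PM'
         1000 570.40 'PM'};
    [ii,ii,ii] = unique(C(2:end,3));       % only the third output (group index) is kept
    out = C([true(2,1);diff(ii)~=0],:);    % true(2,1) keeps the header and the first data row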

ADD 2

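    scale = {'GM';'PM'};                            % notations to keep
    [~,ii] = ismember(C(2:end,3),scale);            % position in scale, 0 if not found
    i1 = [true; ii > 0];                            % header row plus the data rows found in scale
    C1 = C(i1,:);
    out = C1([true(2,1);diff(ii(ii > 0))~=0],:);    % then drop consecutive repeats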
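The first two blocks above group the rows by notation only. Since the question asks for repeats in both the pitch and the notation column, a possible variant is to compare each row with the one above it directly, without calling unique at all. This is only a sketch: it assumes the three columns are available as plain variables (for a dataset array d these would be d.time, d.pitch and d.notation), and it uses the twelve rows from the question's comment.

```matlab
time     = [5725; 5800; 5900; 5900; 6000; 6725; 6800; 7900; 8900; 9000; 9500; 1000];
pitch    = [329.63; 329.63; 311.13; 311.13; 570.40; 329.63; 329.63; 311.13; 311.13; 570.40; 570.40; 570.40];
notation = {'GM';'GM';'MM';'MM';'PM';'GM';'GM';'MM';'MM';'PM';'PM';'PM'};

% Compare every row with its predecessor, column by column.
samePitch    = diff(pitch) == 0;
sameNotation = strcmp(notation(1:end-1), notation(2:end));

% The first row is always kept; later rows are kept unless both columns repeat.
keep = [true; ~(samePitch & sameNotation)];

outTime = time(keep);   % 5725 5900 6000 6725 7900 9000
```

With a dataset array the same mask could be applied as out = d(keep,:), which leaves the row order untouched.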

Comments: 6 (4 older comments not shown)

Andrei Bobrov, 30 Aug 2013

See block ADD 2.

avantika, 2 Sep 2013

Hi,

I get an error when I use the commands you gave for ismember:

    scale = {'NL';'NM';'NU';'RL';'RM';'RU';'GL';'GM';'GU';'mL';'mM';'mU';'DL';'DM';'DU';'SL';'SM';'SU';'PL';'PM';'PU';};
    [~,ii] = ismember(C(2:end,3),scale);
    i1 = ii > 0;
    C1 = C(i1,:);
    out2 = C1([true(2,1);diff(ii(i1))~=0],:);

    ??? Error using ==> dataset.subsref at 82
    Dataset array subscripts must be two-dimensional.

    Error in ==> ismember at 78
    found = find(a(i)==s(:)); % FIND returns indices for LOC.
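The error message suggests that C is a dataset array here rather than the cell array used in the answer: ismember indexes its first input with a single subscript (a(i)), which a dataset array rejects. A possible workaround, sketched under that assumption, is to hand ismember the notation column itself, which is a cell array of strings. The sketch below uses plain variables as stand-ins for ds3.notation and ds3.time so that it runs without a dataset array; the sample values and the short scale list are made up for illustration.

```matlab
% Stand-ins for ds3.notation and ds3.time (hypothetical sample values).
notation = {'GM'; 'GM'; 'XX'; 'PM'; 'PM'; 'GM'};
time     = [1; 2; 3; 4; 5; 6];
scale    = {'GM'; 'PM'};

[tf, loc] = ismember(notation, scale);   % works on a cell array of strings
locKept   = loc(tf);                     % scale index of every row that passes the filter
timeKept  = time(tf);

% There is no header row here, so a single leading true is enough.
keep    = [true; diff(locKept) ~= 0];
outTime = timeKept(keep);                % 1 4 6
```

With the dataset array itself, the two masks would presumably be applied with two-dimensional subscripts, for example tmp = ds3(tf,:); out = tmp(keep,:).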


Answer 2

Simon, 29 Aug 2013

Direct link to this answer:

https://kr.mathworks.com/matlabcentral/answers/85896-delete-only-consecutive-repeated-string-entries-from-a-dataset-in-matlab#answer_95411

Hi!

I don't understand why the second entry is removed. Is a "duplicate" defined if all three columns match or only the second and third?

Does the unique command in 7.9 support the 'rows' option? For example:

b = unique(A, 'rows')

Does your data set consist of a single line for each entry? You could try putting each entry as a string into a cell array and calling unique on that cell array.

Comments: 3 (1 older comment not shown)

avantika, 29 Aug 2013

Hello sir,

Thanks for replying. Yes, the dataset consists of a single line for each entry. The first column is the time stamp, the second column holds the pitch values, and the third column holds the notation for the corresponding pitch value. When I use the following unique command to delete the duplicate pitch values, it first sorts the data in ascending order and then deletes the repeated rows:

    [~,idx]=unique(ds3(:,2));
    withoutduplicates=ds3(idx,:)

However, I don't want the data to be sorted first. I want the command to read it line by line and delete the duplicated lines according to the repeated pitch values.

MATLAB does have a 'first' option for unique, but it is not accepted in my 2009 version of MATLAB, which shows the following error:

    ??? Error using ==> getvarindices at 25
    Unrecognized variable name 'first'.

    Error in ==> dataset.unique at 34
    vars = getvarindices(a,vars,false);

Is there any solution for this? Can I delete rows without sorting in MATLAB 2009?
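The error above is raised by the dataset method of unique, which appears to treat its second argument as a variable name. One possible way around it, assuming the built-in unique for numeric arrays accepts the 'first' flag in this release, is to call unique on the numeric pitch column and sort the returned indices to restore the original row order. Note that this removes every later repeat of a value, not only consecutive ones. The sample values below stand in for ds3.pitch.

```matlab
pitch = [329.63; 329.63; 311.13; 311.13; 570.40; 329.63];   % stand-in for ds3.pitch

[~, firstIdx] = unique(pitch, 'first');   % index of the first occurrence of each value
keepRows = sort(firstIdx);                % back to the original row order: 1 3 5

% With the dataset array this would presumably become:
% withoutduplicates = ds3(keepRows,:);
```

Row 6 is dropped even though it does not directly follow another 329.63, so this fits only the "no repeats anywhere" reading of the problem.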

Simon, 29 Aug 2013

Hi!

Try it with a cell array of strings, as proposed.

The unique command has additional return values that contain the relation between the sorted output and the input. Check the documentation for more information.
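As a small illustration of those additional return values (the sample strings are made up): the second output indexes into the input to give the sorted unique values, and the third output maps every input element back to its position in that sorted list. Neighbouring elements with the same third-output value are therefore consecutive repeats.

```matlab
x = {'GM'; 'GM'; 'MM'; 'PM'; 'GM'};

[u, m, n] = unique(x);
n = n(:);                      % force a column, in case a release returns a row

% u holds the sorted unique strings: {'GM'; 'MM'; 'PM'}
% u equals x(m), and x equals u(n)
% n = [1; 1; 2; 3; 1]

keep = [true; diff(n) ~= 0];   % true where an element differs from the one before it
```

Here keep is [1; 0; 1; 1; 1]: only the second 'GM' is flagged as a consecutive repeat, while the final 'GM' survives because it follows 'PM'.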

avantika, 29 Aug 2013 (edited 29 Aug 2013)

Hi!

I checked the class of each dataset variable before the conversion to a cell array of strings.

class(ds3.pitch)

ans =

double

class(ds3.time)

ans =

double

class(ds3.notation)

ans =

cell

I am not able to convert it to a cell array of strings.
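Going by the class output above, ds3.notation is already a cell array of strings, so only the numeric pitch column would need converting. A minimal sketch, with sample values standing in for ds3.pitch and ds3.notation, builds one string per row and compares neighbouring strings; the two-decimal format and the '_' separator are arbitrary choices for this example.

```matlab
pitch    = [329.63; 329.63; 311.13];   % stand-in for ds3.pitch
notation = {'GM'; 'GM'; 'MM'};         % stand-in for ds3.notation

% Turn each pitch value into a string, one cell per row.
pitchStr = arrayfun(@(p) sprintf('%.2f', p), pitch, 'UniformOutput', false);

% One key per row, e.g. '329.63_GM'.
rowKey = strcat(pitchStr, '_', notation);

% Keep the first row and every row whose key differs from the previous one.
keep = [true; ~strcmp(rowKey(1:end-1), rowKey(2:end))];
```

The resulting rowKey could also be passed to unique, as proposed above, if repeats should be removed regardless of position.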
