Compare datasets and remove rows that are not common

Hello all, I have to compare two datasets (a = 89072 x 13 and b = 89268 x 37) that must contain the same rows, so I have to remove the rows that are not common to both. The two datasets do not have the same number of columns, so I can't use 'intersect' on them directly. How can I remove these rows?
Thank you

Comments: 4

dpb on 8 Oct 2013
"Same rows" by what definition of "same"?
Doriana on 8 Oct 2013
I removed some observations (outliers) from dataset 'a' and now I have to eliminate these observations also from dataset b.
Sean
But there are 13 columns in one and 37 in the other, so you are going to have to define what "same" means. Once you've defined "sameness", we can help identify and remove the rows.
Doriana on 8 Oct 2013
Dear Sean, the two datasets (a and b) initially had the same observations but different columns. The first dataset (a) holds the quantitative variables, while the second (b) holds the qualitative variables. I removed some observations (outliers) from dataset 'a', and I have to eliminate these observations from dataset 'b' as well, because the two datasets must have the same observations. Thank you


Accepted Answer

kittu on 8 Oct 2013


What do you mean by "I have to remove the rows that are not common"? If you mean reducing the size of the other dataset (b) so that it equals the size of a, then you can use
b(1:length(a),:)

Comments: 4

Doriana on 8 Oct 2013
Hi kittu, I tried your command but it doesn't work. I get this error:
Error using getobsindices (line 71) Observation index exceeds dataset dimensions.
Error in dataset/subsrefParens (line 16) [obsIndices, numObsIndices] = getobsindices(a, s(1).subs{1});
Error in dataset/subsref (line 69) [varargout{1:nargout}] = subsrefParens(a,s);
thank you
kittu on 8 Oct 2013
It seems the index exceeded the dataset dimensions, so maybe you have mistakenly swapped a and b. I assume that you want to reduce the number of rows in b to the number of rows in a. I tested it on my system and it works!
Doriana on 8 Oct 2013
I just solved the problem by running these commands:
NDG_X= get(a,'ObsNames');
NDG_Xc= get(b,'ObsNames');
[~,ia,ib] = intersect(NDG_X,NDG_Xc,'stable') ;
a_ridotto=a(ia,:);
b_ridotto=b(ib,:);
thanks anyway, Doriana
Doriana on 8 Oct 2013
Yes, you're right, I had mistakenly swapped a and b. However, I could not use your command, because 'a' and 'b' must contain the same observations and not only have the same size...
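
For later readers, here is a small sketch of why matching by observation name works where truncating by position does not. It assumes both dataset arrays carry 'ObsNames', as in the commands above. The toy data, the variable names x and y, and the names a_common and b_common are made up for illustration and are not taken from the original datasets.

```matlab
% Toy datasets sharing the same four observations (hypothetical values).
names = {'obs1'; 'obs2'; 'obs3'; 'obs4'};
a = dataset({[1; 2; 3; 4], 'x'}, 'ObsNames', names);
b = dataset({[10; 20; 30; 40], 'y'}, 'ObsNames', names);

% Simulate removing an outlier (obs2) from a only.
a = a([1 3 4], :);

% Match the remaining observations by name, keeping the order found in a.
[~, ia, ib] = intersect(get(a, 'ObsNames'), get(b, 'ObsNames'), 'stable');
a_common = a(ia, :);
b_common = b(ib, :);

% Both reduced datasets should now list the same observations.
assert(isequal(get(a_common, 'ObsNames'), get(b_common, 'ObsNames')));
```

By contrast, b(1:size(a,1),:) would keep obs1, obs2 and obs3 in this toy case, so b would still contain the removed observation. If the datasets had no observation names, a shared key variable would be needed instead; that case is not covered here.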
