How to compare the colums ( 2nd and 3rd) of 2 .csv files based on a condition

Question

Ganesh Kini 2020년 6월 23일

0
링크

이 질문에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/553471-how-to-compare-the-colums-2nd-and-3rd-of-2-csv-files-based-on-a-condition

편집: dpb 2020년 6월 26일

I have a program based on some functionality and the results are displayed on 2 .csv files ( file1 and file2)

The files consists of same rows and columns, the columns are named as condition,supply1,supply2

I want to read both the files into the program, load them and then compare the contents of supply1,supply2, based on the condition and number.

We have compare the respective contents and display the difference of supply1 and suppy2 between those 2 files.

Please help me on this.

댓글 수: 9
이전 댓글 7개 표시이전 댓글 7개 숨기기

Ganesh Kini 2020년 6월 24일

편집: Walter Roberson 2020년 6월 24일

MATLAB Online에서 열기

file1 = fopen('file1.csv');
file2 = fopen('file2.csv');  
data1 = textscan(file1, "%s,%f,%f,%d\n", "HeaderLines", 1, "Delimiter",",")
 fclose(file1);
fclose(file2);
data1 =
{
  [1,1] =
  {
    [1,1] = tb
    [2,1] = tb
    [3,1] = tb
    [4,1] = ta
    [5,1] = ta
    [6,1] = tc
    [7,1] = tc
    [8,1] = tc
  }
  [1,2] =
     1.070000
     0.070000
     1.170000
     1.270000
     1.370000
     1.470000
     1.470000
     1.670000
  [1,3] =
    -1.1100
    -1.2300
    -1.6700
    -1.5700
    -1.3700
    -1.0700
    -1.4700
    -3.9700
  [1,4] =
    1
    2
    3
    1
    2
    1
    2
    3
}
>> data2 = textscan(file1, "%s,%f,%f,%d\n", "HeaderLines", 1, "Delimiter",",");
data2 =
{
  [1,1] =
  {
    [1,1] = ta
    [2,1] = ta
    [3,1] = tb
    [4,1] = tb
    [5,1] = tb
    [6,1] = tc
    [7,1] = tc
    [8,1] = tc
  }
  [1,2] =
     5.010000
     0.070000
     1.120000
     3.340000
     4.770000
     4.500000
     2.400000
     1.550000
  [1,3] =
    -6.1100
    -4.2300
    -7.6700
    -8.5700
    -2.3700
    -2.0700
    -4.4700
    -9.9700
  [1,4] =
    1
    2
    1
    2
    3
    1
    2
    3
}

Now I am finiding problem in finding the difference

I want to get the difference between the 2nd and 3rd column based on condition and number. How do I map it and then compare the value?

Eg: condition ta and number 1 of the 4th row of file1 has to compare with condition ta and number 1 of the 1st row of file2

How do I find the difference in this case? Please guide me

Walter Roberson 2020년 6월 24일

innerjoin() ?

Ganesh Kini 2020년 6월 24일

편집: Ganesh Kini 2020년 6월 24일

I don't think innerjoin () would work in this case.

I need to compare the 2nd and 3rd column based on the condition and number respectively in file 1 and file 2.

댓글을 달려면 로그인하십시오.

이 질문에 답변하려면 로그인하십시오.

Answer 1

KSSV 2020년 6월 24일

0
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/553471-how-to-compare-the-colums-2nd-and-3rd-of-2-csv-files-based-on-a-condition#answer_456151

편집: KSSV 2020년 6월 24일

MATLAB Online에서 열기

clc; clear all
file1 = fopen('file1.csv');
file2 = fopen('file2.csv');  
data1 = textscan(file1, "%s,%f,%f,%d\n", "HeaderLines", 1, "Delimiter",",") ; 
data2 = textscan(file2, "%s,%f,%f,%d\n", "HeaderLines", 1, "Delimiter",",") ; 
fclose(file1);
fclose(file2);
str1 = data1{1} ; 
str2 = data2{1} ; 
data1{1} = [] ; M1 = cell2mat(data1) ; 
data2{1} = [] ; M2 = cell2mat(data2) ; 
 
data1 = M1(:,1:2) ; 
data2 = M2(:,1:2) ; 
C1 = M1(:,end) ; 
C2 = M2(:,end) ; 
s = unique(str1) ; N = unique(C1) ; 
m = length(s) ; n = length(N) ; 
iwant = cell(m,n) ; 
for i = 1:m
  idx = strcmp(str1,s{i}) ; 
  idy = strcmp(str2,s{i}) ; 
  % Check same values in C1, C2
  for j = 1:n
    id1 = C1(idx)==N(j) ; 
    id2 = C2(idx)==N(j) ; 
    iwant{i,j} = data1(id1,:)-data2(id2,:) ; 
  end 
end

댓글 수: 8
이전 댓글 6개 표시이전 댓글 6개 숨기기

KSSV 2020년 6월 24일

You are getting wierd errors...

dpb 2020년 6월 24일

편집: dpb 2020년 6월 24일

"Error using cell2mat (line 45)

All contents of the input cell array must be of the same data type."

Of course -- the first column is nonumeric so is cellstr(), remainder are.

I'd suggest using readtable instead of textscan first, then turn the first column into a categorical variable and use it and the fourth for grouping.

댓글을 달려면 로그인하십시오.

Answer 2

dpb 2020년 6월 25일

0
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/553471-how-to-compare-the-colums-2nd-and-3rd-of-2-csv-files-based-on-a-condition#answer_456721

편집: dpb 2020년 6월 25일

MATLAB Online에서 열기

Has to be a more direct way, but following works if I understand the request --

tF1=readtable('file1.csv'); tF1.condition=categorical(tF1.condition);  % read tables, condition
tF2=readtable('file2.csv'); tF2.condition=categorical(tF2.condition);  % turned to categorical
[g1,id1,id2]=findgroups(tF1.condition,tF1.number);                     % get group number each file
g2=findgroups(tF2.condition,tF2.number);
[~,ixg1]=sort(g1); [~,ixg2]=sort(g2);                                  % groups may not be in order
dS1=accumarray(g2,tF2.supply1)-accumarray(g1,tF1.supply1);             % compute Supply 1 differences
dS2=accumarray(g2,tF2.supply2)-accumarray(g1,tF1.supply2);             % compute Supply 1 differences
tF1F2=table(id1,id2,tF2.supply1(ixg2),tF1.supply1(ixg1),dS1,tF2.supply2(ixg2),tF1.supply2(ixg1),dS2);
tF1F2.Properties.VariableNames={'condition','number','F2 S1','F1 S1','diff S1','F2 S2','F1 S2','diff S2'}
% results in...
tF1F2 =
  8×8 table
    condition    number    F2 S1    F1 S1    diff S1    F2 S2    F1 S2    diff S2
    _________    ______    _____    _____    _______    _____    _____    _______
       ta         1.00     5.01     1.27       3.74     -6.11    -1.57     -4.54 
       ta         2.00     0.07     1.37      -1.30     -4.23    -1.37     -2.86 
       tb         1.00     1.12     1.07       0.05     -7.67    -1.11     -6.56 
       tb         2.00     3.34     0.07       3.27     -8.57    -1.23     -7.34 
       tb         3.00     4.77     1.17       3.60     -2.37    -1.67     -0.70 
       tc         1.00     4.50     1.47       3.03     -2.07    -1.07     -1.00 
       tc         2.00     2.40     1.47       0.93     -4.47    -1.47     -3.00 
       tc         3.00     1.55     1.67      -0.12     -9.97    -3.97     -6.00 
>> 

The alternative could be to catenate the two files vertically adding a "fileID" variable and operate over it.

But, none of the splitapply workflows are set up to let one use more than one grouping variable over a selection of variables. accumarray builds the differences from the grouping variable ok, but assembling the final table requires ensuring put the rows in with the results of it in the proper location by using the sorted order of the grouping variable.

I'm sure I'm overlooking something simple and obvious but doesn't come to me at the moment.

댓글 수: 2
없음 표시없음 숨기기

KSSV 2020년 6월 25일

OP is using octave...there is no readable in octave.

dpb 2020년 6월 25일

편집: dpb 2020년 6월 26일

This is a MATLAB forum...but there was a response earlier that said was trying both.

ADDENDUM:

Altho the table really has no bearing on the solution other than the output display. I was thinking rowfun or varfun would probably be the ticket when wrote first comment...but the different groupings need makes funky.

Actually, the simplest is probably to resort both data sets so the groupings are in order to begin with -- then the two grouping vectors will be the same and sequential so don't have to resort them.

ADDENDUM SECOND:

" the simplest is probably to resort both data sets ..."

And, then, in fact, there's no longer a need for the explicit grouping variables as long as the two datasets have the same number observations for the same groups -- one can just do the analysis on a row-by-row basis.

댓글을 달려면 로그인하십시오.

How to compare the colums ( 2nd and 3rd) of 2 .csv files based on a condition

댓글 수: 9
이전 댓글 7개 표시이전 댓글 7개 숨기기

답변 (2개)

댓글 수: 8
이전 댓글 6개 표시이전 댓글 6개 숨기기

댓글 수: 2
없음 표시없음 숨기기

참고 항목

카테고리

태그

Community Treasure Hunt

How to compare the colums ( 2nd and 3rd) of 2 .csv files based on a condition

댓글 수: 9 이전 댓글 7개 표시이전 댓글 7개 숨기기

답변 (2개)

댓글 수: 8 이전 댓글 6개 표시이전 댓글 6개 숨기기

댓글 수: 2 없음 표시없음 숨기기

참고 항목

카테고리

태그

Community Treasure Hunt

댓글 수: 9
이전 댓글 7개 표시이전 댓글 7개 숨기기

댓글 수: 8
이전 댓글 6개 표시이전 댓글 6개 숨기기

댓글 수: 2
없음 표시없음 숨기기