Searching the contents of a file with items in a second file

I have two files, File1.txt and File2.txt. I'm searching the first column of File1.txt for the items listed in File2.txt.
File1.txt (tab-delimited):
Ex_efxb 0.0023
MSeef 2.3000
F_ecjc 0.3338
MWEEI -0.111
DDAIij 17.777
File2.txt
MSeef
F_ecjc
Required output:
The following were found in File 1:
MSeef 2.3000
F_ecjc 0.3338
I have the following script, which is not giving the right output; instead it says the items were not found.
clear all; clc;
fid = fopen('File1.txt')
if fid ==-1
disp('Reaction_Flux file open not successful')
else
% Read characters and numbers into separate elements
% in a cell array
rxn_flux = textscan(fid,'%s %f');
A = rxn_flux{1};
B = rxn_flux{2};
len1 = size(A);
closeresult1 = fclose(fid);
if closeresult1 == 0
disp('Reaction_Flux file close successful')
else
disp('Reaction_Flux file close not successful')
end
end
fid = fopen('File2.txt')
if fid ==-1
disp('Flux file open not successful')
else
% Read characters and numbers into separate elements
% in a cell array
[rxn] = textread('File2.txt','%s');
%rxn = textscan(fid,'%s');
C = rxn;
len2 = size(C);
closeresult1 = fclose(fid);
if closeresult1 == 0
disp('Reaction_Flux file close successful')
else
disp('Reaction_Flux file close not successful')
end
end
found = 0;
for i = 1:len1
RXNFLUX = strcmpi(A(i),C);
if RXNFLUX
found = 1;
break
end
end
if found
data = [];
for k = 1:len1
for m = 1:len2
data = [data; A(k),C(m)]
%print to file
fprintf(data,'%s\t %d\n','Out.txt');
end
end
else
disp('rxn not found')
end
Can anyone help? Thanks

Accepted Answer

Andrew Newell on 29 Mar 2011
The explanation: strcmpi does a case-insensitive comparison (for case-sensitive matching, use strcmp instead) and returns a logical array with value 1 for every element of A that matches C{i} and 0 for every element that does not. The vertical bar (|) means "or": if an element of RXNFLUX is already 1, it stays 1; if it is 0 and strcmpi finds a match, it is set to 1.
For example, initially RXNFLUX is all zeros. After the first iteration it is [0; 1; 0; 0; 0], because MSeef is the second element of A. In the second iteration, the search returns [0; 0; 1; 0; 0], because F_ecjc is the third element of A. Combining this with RXNFLUX using "or" gives [0; 1; 1; 0; 0].
The line
iLines = find(RXNFLUX);
finds the indices of all the elements of RXNFLUX that are equal to 1.
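The accumulation described above can be traced step by step. This is a minimal sketch that assumes the name columns of the two files have already been read into cell arrays; the hard-coded A and C below are stand-ins for that data, and the variable hit is introduced only for illustration.

```matlab
% Hypothetical in-memory copies of the name columns (normally read from
% File1.txt and File2.txt)
A = {'Ex_efxb'; 'MSeef'; 'F_ecjc'; 'MWEEI'; 'DDAIij'};
C = {'MSeef'; 'F_ecjc'};

RXNFLUX = false(size(A));        % start with "nothing found"
for i = 1:length(C)
    hit = strcmpi(C{i}, A);      % one logical per row of A for this name
    RXNFLUX = RXNFLUX | hit;     % keep earlier matches, add new ones
    disp(RXNFLUX.')              % show the running mask as a row
end

iLines = find(RXNFLUX);          % row numbers in A that matched any name
```

Because the mask is combined with "or", the order of the names in C does not affect which rows end up marked.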

More Answers (4)

Andrew Newell on 29 Mar 2011
You've got one mistake:
len2 = length(B);
should be
len2 = length(C);
That must have crept in when you were changing size to length.
Andrew Newell on 28 Mar 2011
The file reading is fine, except that it would be better to use
len1 = length(A)
len2 = length(C)
so that len1 and len2 are scalars. However, there are a lot of problems with the processing, including
  1. accessing the cell array A using A(i) instead of A{i},
  2. testing for string matches ( found=1 ) before you are finished searching, and
  3. not opening the file Out.txt.
Note also that if you search A for elements of C instead of the reverse, you get the indices you need for the next part.
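The first two points in the list can be checked on a small example. This is only a sketch: the three-element A is made-up sample data, and byParen, byBrace, mask, allMatch and someMatch are names chosen for illustration.

```matlab
% Hypothetical sample data standing in for the first column of File1.txt
A = {'Ex_efxb'; 'MSeef'; 'F_ecjc'};

byParen = A(2);   % parentheses return a 1-by-1 cell containing the string
byBrace = A{2};   % braces return the string itself (a char array)

% Comparing one string against a whole cell array gives one logical
% value per element.
mask = strcmpi('MSeef', A);

% "if mask" is true only when ALL elements are nonzero, so a partial
% match needs any(mask) instead.
if mask
    allMatch = true;
else
    allMatch = false;
end
someMatch = any(mask);
```

This is why a test such as "if RXNFLUX" on a logical vector reports nothing found unless every element matches.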
Here is code that will do the analysis:
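% Assumes A, B, C and len2 = length(C) from the file-reading code
fid = fopen('Out.txt','w');
fprintf(fid,'The following were found in File 1:\n');
RXNFLUX = false(size(A));
% Mark every row of A that matches any entry of C (case-insensitive)
for i = 1:len2
    RXNFLUX = RXNFLUX | strcmpi(C{i},A);
end
if any(RXNFLUX)
    iLines = find(RXNFLUX);
    for i = 1:length(iLines)
        % %.4f rather than %d, since B holds non-integer values
        fprintf(fid,'%s\t %.4f\n',A{iLines(i)}, B(iLines(i)));
    end
else
    disp('rxn not found')
end
fclose(fid);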
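As an aside, the same mask can be built without a loop. The sketch below swaps in ismember, which is not what the code above uses; because ismember is case-sensitive, both sides are lower-cased to approximate strcmpi. The hard-coded A, B and C are stand-ins for the data read from the two files, and the output goes to the command window rather than Out.txt.

```matlab
% Hypothetical in-memory data standing in for the two files
A = {'Ex_efxb'; 'MSeef'; 'F_ecjc'; 'MWEEI'; 'DDAIij'};
B = [0.0023; 2.3000; 0.3338; -0.111; 17.777];
C = {'MSeef'; 'F_ecjc'};

% ismember is case-sensitive, so lower-case both sides to mimic strcmpi
RXNFLUX = ismember(lower(A), lower(C));

% Print each matching name with its value; find returns a column, so
% transpose it to loop over the indices one at a time
for k = find(RXNFLUX).'
    fprintf('%s\t%.4f\n', A{k}, B(k));
end
```

The loop version is easier to step through, while this form is shorter when only the final mask matters.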
James on 29 Mar 2011
Hi Andrew,
I apologise unreservedly for my very late reply. I was not able to log in to the site until a few minutes ago. Thank you for the solution. I have connected your code section to my file-reading code, and I got an error:
??? Index exceeds matrix dimensions.
Error in ==> sollutionRxnFluxFiles at 47
RXNFLUX = RXNFLUX | strcmpi(C{i},A);
Here's the latest code:
fid = fopen('File1.txt')
if fid ==-1
disp('Reaction_Flux file open not successful')
else
% Read characters and numbers into separate elements
% in a cell array
rxn_flux = textscan(fid,'%s %f');
A = rxn_flux{1};
B = rxn_flux{2};
len1 = length(A);
closeresult1 = fclose(fid);
if closeresult1 == 0
disp('Reaction_Flux file close successful')
else
disp('Reaction_Flux file close not successful')
end
end
fid = fopen('File2.txt')
if fid ==-1
disp('Flux file open not successful')
else
% Read characters and numbers into separate elements
% in a cell array
[rxn] = textread('File2.txt','%s');
%rxn = textscan(fid,'%s');
C = rxn;
len2 = length(B);
closeresult1 = fclose(fid);
if closeresult1 == 0
disp('Reaction_Flux file close successful')
else
disp('Reaction_Flux file close not successful')
end
end
fid = fopen('Out.txt','w');
fprintf(fid,'The following were found in File 1:\n')
RXNFLUX = false(size(A));
for i = 1:len2
RXNFLUX = RXNFLUX | strcmpi(C{i},A);
end
if any(RXNFLUX)
iLines = find(RXNFLUX);
for i=1:length(iLines)
fprintf(fid,'%s\t %d\n',A{iLines(i)}, B(iLines(i)));
end
else
disp('rxn not found')
end
fclose(fid);
Thanks!
James on 29 Mar 2011
Now, it works beautifully! Many thanks, Andrew.
Do you mind explaining this bit of code?
for i = 1:len2
RXNFLUX = RXNFLUX | strcmpi(C{i},A);
end

Asked: 28 Mar 2011