Searching for a given line in a text file

Ram on 28 Feb 2011
The following is a text file in SDF format (chemical structures). It looks something like this:
7 9 1 0 0 0 0
7 14 1 0 0 0 0
8 10 1 0 0 0 0
8 15 1 0 0 0 0
9 10 2 0 0 0 0
9 16 1 0 0 0 0
10 17 1 0 0 0 0
12 13 1 0 0 0 0
13 18 1 0 0 0 0
13 19 1 0 0 0 0
13 20 1 0 0 0 0
M END
> <PUBCHEM_COMPOUND_CID>
2244
> <PUBCHEM_COMPOUND_CANONICALIZED>
1
> <PUBCHEM_CACTVS_COMPLEXITY>
212
I need to extract just the value under each PUBCHEM_COMPOUND_CID field, and there can be multiple CID fields in a single file. How should I go about this? Any help would be appreciated.

Accepted Answer

Ram on 1 Mar 2011
I tried something like this:
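[A,B] = uigetfile('*.sdf','sdf');        % A = file name, B = folder
C = fopen(fullfile(B,A),'r');            % include the folder so the file opens from any directory
n = input('Number of structures: ');     % number of structures, obtained from the user
pubchem_id = [];
z = n*300;                               % rough approximation: 300 lines per structure
for j = 1:z
    D = fgetl(C);                        % fgetl returns -1 once the end of the file is passed
    if strcmp('> <PUBCHEM_COMPOUND_CID>',D)
        E = fgetl(C);                    % the CID value is on the line after the tag
        pubchem_id = [pubchem_id; str2double(E)];
    end
end
fclose(C);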
and it worked :)
  Comments: 2
David Young on 1 Mar 2011
The for loop that assumes 300 lines per structure is a hostage to fortune: what if the structures average more than 300 lines? You could avoid this by using a while loop that keeps reading until it either finds a particular line or reaches the end of the file, which would be far more robust.
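A minimal sketch of the while-loop idea, assuming the CID value always sits on the line directly after its tag. The two-record sample file and the variable names (`sampleLines`, `fname`, `tline`) are made up here so the snippet runs on its own; in practice `fname` would come from `uigetfile`.

```matlab
% Write a tiny made-up sample so the sketch is self-contained.
sampleLines = {'> <PUBCHEM_COMPOUND_CID>', '2244', '', '$$$$', ...
               '> <PUBCHEM_COMPOUND_CID>', '3672', '', '$$$$'};
fname = [tempname '.sdf'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', sampleLines{:});
fclose(fid);

% Keep reading until fgetl returns -1 (a number, not text) at end of file.
fid = fopen(fname, 'r');
pubchem_id = [];
tline = fgetl(fid);
while ischar(tline)
    if strcmp(strtrim(tline), '> <PUBCHEM_COMPOUND_CID>')
        % The value is assumed to be on the very next line.
        pubchem_id(end+1, 1) = str2double(fgetl(fid));
    end
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);   % remove the temporary sample file
```

Because the loop stops only when `fgetl` stops returning text, no guess about the number of lines per structure is needed.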
Ram on 4 Mar 2011
I didn't use a while loop because nothing in an SDF file marks the end of the file. For instance, $$$$ marks the end of each structure, and there can be multiple $$$$ lines depending on the number of structures. A structure has about 180 lines on average, so 300 is actually more than enough, and when a structure has more than 300 lines, that is compensated for by the ones that have fewer than 300.
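For illustration only: although the SDF text has no end-of-file marker, `fgetl` itself reports the end of the file by returning -1, and the `$$$$` delimiters can be counted on the way. The sketch below assumes every structure ends with a `$$$$` line; the sample data and the names `nStructures` and `nLines` are hypothetical.

```matlab
% Made-up two-structure sample written to a temporary file.
sampleLines = {'> <PUBCHEM_COMPOUND_CID>', '2244', '', '$$$$', ...
               '> <PUBCHEM_COMPOUND_CID>', '3672', '', '$$$$'};
fname = [tempname '.sdf'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', sampleLines{:});
fclose(fid);

% Count lines and $$$$ delimiters in one pass over the whole file.
fid = fopen(fname, 'r');
nStructures = 0;
nLines = 0;
tline = fgetl(fid);
while ischar(tline)               % ischar fails once fgetl returns -1
    nLines = nLines + 1;
    if strcmp(strtrim(tline), '$$$$')
        nStructures = nStructures + 1;
    end
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);   % remove the temporary sample file
```

With this, neither the number of structures nor a lines-per-structure estimate has to be supplied by the user.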


More Answers (1)

Walter Roberson on 28 Feb 2011
There is not much you can do except fgetl() through the file until you encounter the M END line, and do the extraction work from there. How easy the extraction is after that depends on how regular the data following M END is and on which fields you are interested in.
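A rough sketch of that approach for a single structure, assuming each data item is a `> <NAME>` header followed by a one-line value, as in the excerpt from the question. The sample record and the names `fields` and `inData` are hypothetical; the pattern `^M\s+END` is used so that one or more spaces between `M` and `END` both match.

```matlab
% Made-up single-structure sample written to a temporary file.
sampleLines = {'  1  2  1  0  0  0  0', 'M  END', ...
               '> <PUBCHEM_COMPOUND_CID>', '2244', '', ...
               '> <PUBCHEM_CACTVS_COMPLEXITY>', '212', '', '$$$$'};
fname = [tempname '.sdf'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', sampleLines{:});
fclose(fid);

fid = fopen(fname, 'r');
fields = struct();     % field name -> value (kept as text)
inData = false;        % true between the M END line and $$$$
tline = fgetl(fid);
while ischar(tline)
    if ~isempty(regexp(tline, '^M\s+END', 'once'))
        inData = true;                      % data items follow the bond table
    elseif strcmp(strtrim(tline), '$$$$')
        inData = false;                     % end of this structure
    elseif inData
        tok = regexp(tline, '^>\s*<(\w+)>', 'tokens', 'once');
        if ~isempty(tok)
            value = fgetl(fid);             % value assumed to be one line long
            if ischar(value)
                fields.(tok{1}) = strtrim(value);
            end
        end
    end
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);   % remove the temporary sample file
```

Here `fields.PUBCHEM_COMPOUND_CID` holds the text `'2244'`; for a file with several structures, the struct would need to be stored per record each time `$$$$` is reached, since this sketch overwrites repeated field names.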
  Comments: 1
Ram on 1 Mar 2011
Thank you so much :) I built my code based on your reply.

