How to extract part of a text file in MATLAB?
I have opened an XML file and want to extract the relevant text stored in it. The relevant text starts after a certain string of characters in the file, so I tried to use an if statement to extract the text from that point until another marker is reached. That would cut out most of the meaningless text and leave the text that I want. I tried the following code:
File1 = fopen('Factual1.xml','r');
File2 = fopen('Factual2.xml','r');
File3 = fopen('Colloquial1.xml','r');
File4 = fopen('Colloquial2.xml','r');
File5 = fopen('Hello.xml','r');
File6 = fopen('Hello2.xml','r');
Filenames = {'File1';'File2';'File3';'File4';'File5';'File6'};
B = {0};
for i=File1:File6
    A = fscanf(i,'%s');
    if ~(strcmp(A,'<w:pw:rsidR="00E3286E"w:rsidRDefault="'))
        while((B = fscanf(i,'%c')) ~='\')
            B
        end
    end
end
But I keep getting an error saying that the statement B = fscanf(i,'%c') is not valid. Is there another way to scan the contents of each file character by character, so that I can extract just the text that I want?
Answers (2)
Ken Atwell
June 3, 2013
I'm guessing you're a C programmer. You can't assign B in the while loop's conditional like you are attempting to do. Use two lines:
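B = fscanf(i, '%c', 1);          % size 1 reads one character; without it fscanf reads the whole file
while ~isempty(B) && B ~= '\'    % stop at end of file or at a backslash
    ...
    B = fscanf(i, '%c', 1);
end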
BTW, I believe your for loop works only "accidentally": MATLAB tends to assign file identifiers in increasing numeric order, so File1:File6 happens to cover all six files, but that ordering is perhaps not guaranteed.
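One way to avoid depending on the numeric order of file identifiers is to loop over the file names and open each file inside the loop. The following is only a minimal sketch of that idea, not a definitive solution. The two sample files written to tempdir and the variable names (names, firstParts, buf, c) are assumptions made so the example runs on its own; with real data you would list 'Factual1.xml', 'Factual2.xml', and so on instead.

```matlab
% Hypothetical sample files so the sketch runs on its own; replace these
% with the real names, e.g. {'Factual1.xml', 'Factual2.xml', ...}.
names = {fullfile(tempdir, 'sampleA.xml'), fullfile(tempdir, 'sampleB.xml')};
for k = 1:numel(names)
    fid = fopen(names{k}, 'w');
    fprintf(fid, 'abc\\def');        % writes the 7 characters abc\def
    fclose(fid);
end

firstParts = cell(numel(names), 1);  % one result per file
for k = 1:numel(names)
    fid = fopen(names{k}, 'r');
    if fid == -1                     % fopen returns -1 on failure
        warning('Could not open %s', names{k});
        continue
    end
    buf = '';
    c = fscanf(fid, '%c', 1);        % read exactly one character
    while ~isempty(c) && c ~= '\'    % stop at end of file or at a backslash
        buf(end+1) = c;              % append the character just read
        c = fscanf(fid, '%c', 1);
    end
    fclose(fid);
    firstParts{k} = buf;
end
```

Each entry of firstParts then holds the characters that precede the first backslash in the corresponding file. Opening and closing each file inside the loop also means that only one file is open at a time.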
Walter Roberson
June 4, 2013
MATLAB appears to follow what POSIX does, which is to allocate the first available (lowest numbered) file descriptor. But that does not mean that the results will always be consecutive.
fid1 = fopen('file1');
fid2 = fopen('file2');
fid3 = fopen('file3');
fclose(fid1);
fclose(fid2);
nfid1 = fopen('nfile1');
nfid2 = fopen('nfile2');
nfid3 = fopen('nfile3');
If we assume nothing had been opened before, fid1 will be 3, fid2 will be 4, and fid3 will be 5. Closing fid1 and fid2 releases 3 and 4, so nfid1 will be 3 and nfid2 will be 4, but nfid3 will be the next available identifier, 6, rather than the consecutive 5.
Paul Metcalf
June 4, 2013
You are defining B as a cell array (B = {0}) and then trying to replace B with data of a different type. Try initializing B properly first, e.g. B = cell(m,n); then assign data into each cell of the array with B{1,1} = 'first line of data'; and so on. Your code is really poorly constructed in general. If I have time tonight I'll look at sending you some more tips.
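As a rough illustration of the cell-array suggestion, the sketch below preallocates B with one cell per file and stores the extracted text of each file in its own cell. It swaps the character-by-character loop for fileread and strfind, which is a different technique from the one in the question. The sample file, its contents, and the marker strings (startMarker, stopMarker) are hypothetical and exist only to make the example self-contained.

```matlab
% Hypothetical sample file and markers, for illustration only.
startMarker = '<start>';
stopMarker  = '\';
sampleFile  = fullfile(tempdir, 'sampleC.xml');
fid = fopen(sampleFile, 'w');
fprintf(fid, '%s', 'junk<start>wanted text\more junk');
fclose(fid);

names = {sampleFile};
B = cell(numel(names), 1);          % preallocate one cell per file
for k = 1:numel(names)
    txt = fileread(names{k});       % whole file as one character vector
    iStart = strfind(txt, startMarker);
    if isempty(iStart)
        B{k} = '';                  % start marker not present in this file
        continue
    end
    from = iStart(1) + numel(startMarker);   % first character after the marker
    iStop = strfind(txt(from:end), stopMarker);
    if isempty(iStop)
        B{k} = txt(from:end);       % no stop marker: keep the rest of the file
    else
        B{k} = txt(from:from + iStop(1) - 2);  % text up to, not including, the stop marker
    end
end
```

Reading the whole file at once keeps the extraction logic separate from the file handling, at the cost of holding each file in memory while it is processed.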