Problems reading a text file: saving text file with no changes solves problem.

I am attempting to read the text file output of a calibration program. First I read the headers, and then a block of tab-delimited data. The program reads the headers fine, and the first row of data, but won't read any further. Attempts to repeat the textscan yield empty arrays. The strange thing is, if I open the file in a text editor, hit 'save', and close it (no changes), the program reads the entire block fine from then on. The problem is, I have 500 text files to read, so it's not feasible to apply that fix by hand. What is going on?
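fid=fopen('D:\Documents\2013-03-22 advance\2013-04-26 Data\244522_D41531\Cal date_6_28_10\244522_M000139-22.log');
% Read the three header lines as strings
InputText=textscan(fid,'%s',3,'delimiter','\n');
HeaderLines{1,1}=InputText{1};
% Read the data block: 12 numeric columns per row
FormatString=repmat('%f',1,12);
InputText=textscan(fid,FormatString);
% block: index of the current data block (not defined in this snippet)
Data{1,block}=cell2mat(InputText);
fclose(fid);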
Text file:
12-28-2010 09:31:54 P Order Temp Gauge Press ACOM Press ACOM Coeff Pressure Gauge Temp ACOM Temp ACOM Coeff Temp
0 24.500 -537.000000 7.669663 -0.00000000004630 0.008602043631 12.288974016004 1407699.033333 92.758391 0.0000000000003991 0.000065322756 0.012842438426
1 24.500 326121.533333 2812.541263 -0.00000000004629 0.008602046767 12.151343623883 1408328.900000 92.799794 0.0000000000003993 0.000065322482 0.012372063805
2 24.500 654594.833333 5623.192599 -0.00000000004631 0.008602055174 12.177162114847 1408365.800000 92.802277 0.0000000000003990 0.000065322769 0.012581150529
3 24.500 736736.433333 6324.341463 -0.00000000004631 0.008602150791 11.960519989727 1408197.100000 92.789478 0.0000000000003991 0.000065322953 0.010464726758
4 24.500 983589.866667 8428.298898 -0.00000000004630 0.008602059501 12.196093036253 1408356.400000 92.801250 0.0000000000003991 0.000065322687 0.012123876870
5 24.500 1313325.066667 11229.927587 -0.00000000004629 0.008602104173 12.418094060714 1408445.600000 92.810781 0.0000000000003991 0.000065323104 0.014945905254
6 24.500 1643711.266667 14026.375710 -0.00000000004630 0.008602020141 12.230445831539 1408567.700000 92.815995 0.0000000000003991 0.000065322499 0.013012225901
7 24.500 1892050.800000 16121.981903 -0.00000000004630 0.008602057563 12.204692474872 1408551.400000 92.815155 0.0000000000003992 0.000065322770 0.012724860383
8 24.500 2057857.466667 17517.769870 -0.00000000004631 0.008602090628 12.015490546354 1408526.433333 92.811144 0.0000000000003990 0.000065322682 0.010778231650
9 24.500 1809198.100000 15423.503094 -0.00000000004626 0.008602026963 12.163358626346 1408227.700000 92.793604 0.0000000000003994 0.000065322680 0.012293714268
10 24.500 1478632.300000 12630.338144 -0.00000000004631 0.008602149028 12.172133596978 1408027.533333 92.781184 0.0000000000003990 0.000065323672 0.012555195047
11 24.500 1231039.900000 10531.498008 -0.00000000004631 0.008602068894 12.194385935947 1408078.533333 92.782606 0.0000000000003990 0.000065322715 0.012032430559
12 24.500 1148544.133333 9830.955916 -0.00000000004630 0.008602092790 12.149688220127 1408086.466667 92.783394 0.0000000000003990 0.000065322882 0.011941009410
13 24.500 819116.900000 7027.269743 -0.00000000004632 0.008602049288 12.261455157199 1407975.700000 92.777067 0.0000000000003989 0.000065322782 0.013312766495
14 24.500 490319.900000 4218.922748 -0.00000000004630 0.008602086789 12.278713009730 1407891.700000 92.771717 0.0000000000003991 0.000065322744 0.013208242596
15 24.500 244100.500000 2109.282225 -0.00000000004631 0.008602162914 12.249573904351 1407739.233333 92.761729 0.0000000000003992 0.000065322867 0.013156009197
16 24.500 162120.833333 1405.647447 -0.00000000004631 0.008602085856 12.287313925406 1407901.000000 92.772107 0.0000000000003989 0.000065322994 0.013025389142

Answers (1)

Cedric on 13 Jul 2013
Edited: Cedric on 13 Jul 2013
The text editor may be replacing tabs with spaces, or \r or \n with \r\n, or odd characters with spaces when it saves. Did you try using
InputText = textscan(fid, FormatString, 'Delimiter', '\t') ;
for reading the data? TEXTSCAN already treats \t as a delimiter by default, but restricting the delimiters to \t only might make a difference.
(By the way, the name of your variable InputText doesn't reflect its class: it is a cell array with a numeric array in each cell.)
PS: if it still doesn't work, execute your code for reading the header, and then do the following:
>> buffer = fread(fid) ;
>> buffer(1:40)'
This displays the content interpreted as unsigned 8-bit integers and lets you check which characters are actually in there.
Executing this on the content that you provided (which was filtered by the forum and by my editor), I get:
48 32 32 32 50 52 46 53 48 48 32 32 32 32 45 53 51 55 46 48 48 48 48 48 ...
This encodes "0 24.500 -537.000000" ... (look at an ASCII table for the correspondence). We can see that the tabs I initially put in manually for the test were replaced by runs of spaces (code 32) when I saved the file.
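Rather than reading the codes off by eye, the same check can be done by counting the non-printable bytes. The following is only a sketch: it writes a small hypothetical demo file (the name `demoFile` and its contents are made up for illustration) and then tallies tabs, nulls, CR and LF in the raw bytes. On a real log you would skip the writing step and `fread` the file directly.

```matlab
% Hypothetical demo file: one tab-delimited row, null padding, then CR/LF
demoFile = [tempname '.log'];
fid = fopen(demoFile, 'w');
fwrite(fid, [uint8(sprintf('0\t24.500\t-537.000000')), zeros(1, 5, 'uint8'), 13, 10], 'uint8');
fclose(fid);

% Read the raw bytes back without any character conversion
fid = fopen(demoFile, 'r');
buffer = fread(fid, Inf, 'uint8=>uint8')';
fclose(fid);
delete(demoFile);

% Count the bytes that a text editor would not show
nTabs  = sum(buffer == 9);    % horizontal tab
nNulls = sum(buffer == 0);    % null byte
nCR    = sum(buffer == 13);   % carriage return
nLF    = sum(buffer == 10);   % line feed
fprintf('tabs: %d, nulls: %d, CR: %d, LF: %d\n', nTabs, nNulls, nCR, nLF);
```

A nonzero null count would point to padding that an editor silently rewrites on save.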

Comments: 3

None of the delimiter settings make any difference, including treating multiple delimiters as one, etc. I ran the buffer command like you said. At the end of the line there are ~120 '0' (null) bytes, a '13' (CR), a '10' (LF), and then the numbers of the next line start.
Here is a link to the file if you want to check it out yourself. http://www.jpmassey.com/264157_SUB0011-1.log
Saving the file turns the nulls into spaces.
OK, I can help you with this issue, but first a question: there are multiple blocks in your log file, and the code that you provided would not get them all. Is your goal ultimately to extract them all?
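Since saving the file in an editor turns the nulls into spaces, the same substitution could be done in code instead of by hand for each of the 500 files. The sketch below assumes the null padding is what stops TEXTSCAN; it reads the file as bytes, replaces code 0 with code 32, and parses the resulting string. The demo file, `nHeaderLines` and `nCols` are hypothetical stand-ins (the real files have 3 header lines and 12 columns), and it handles a single block only.

```matlab
% Hypothetical demo file: 1 header line, 2 rows of 3 columns, nulls before CR/LF
demoFile = [tempname '.log'];
pad = zeros(1, 4, 'uint8');
raw = [uint8(sprintf('A\tB\tC')), 13, 10, ...
       uint8(sprintf('0\t24.500\t-537.0')), pad, 13, 10, ...
       uint8(sprintf('1\t24.500\t326121.5')), pad, 13, 10];
fid = fopen(demoFile, 'w');
fwrite(fid, raw, 'uint8');
fclose(fid);

% Read everything as bytes and turn nulls into spaces, as the editor does on save
fid = fopen(demoFile, 'r');
bytes = fread(fid, Inf, 'uint8=>uint8')';
fclose(fid);
delete(demoFile);
bytes(bytes == 0) = 32;
txt = char(bytes);

% Parse the cleaned text from memory instead of from the file
nHeaderLines = 1;    % assumption for the demo; the original files have 3
nCols = 3;           % assumption for the demo; the original files have 12
cols = textscan(txt, repmat('%f', 1, nCols), ...
                'HeaderLines', nHeaderLines, 'EndOfLine', '\r\n');
data = cell2mat(cols);
disp(data);
```

Wrapping the read-and-replace step in a loop over the file names would avoid re-saving each file manually.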


Asked: 13 Jul 2013
