How to use textscan to read data with missing values?

Zoe on 5 Jun 2011
Commented: Christopher Conatser on 27 Sep 2016
I can't get textscan to import a .txt file when the data itself contains missing values.
For example, I have a file test.txt in which * represents missing (empty) data and the delimiter is plain whitespace:
1 2 3 4
5 6 A *
7 8 A *
9 * * 10
Are there any experts who can help me? Thanks a lot!

Accepted Answer

Jan on 6 Jun 2011
Does the file actually contain '*' as a marker for empty values, or did you insert the stars here for display purposes only?
If the real file simply contains nothing where a value is missing:
Data = textscan(FID, '%f%f%f%f', 'Delimiter', ' ', 'EmptyValue', Inf);
If there are stars in the file, you could use an intermediate step:
Str = fileread(FileName);
Str = strrep(Str, '*', '');
Data = textscan(Str, '%f%f%f%f', 'Delimiter', ' ', 'EmptyValue', Inf);
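As an alternative to stripping the stars first, textscan has a 'TreatAsEmpty' option that marks given placeholder text as an empty numeric field. The following is only a sketch under assumptions: the file contents are the four sample lines from the question (held in a string here), the fields are separated by whitespace, and the third column is read as text with %s because the sample contains 'A' there and the question does not say what 'A' means.

```matlab
% Sample contents of test.txt from the question, kept in a string for illustration.
Str = sprintf('1 2 3 4\n5 6 A *\n7 8 A *\n9 * * 10\n');

% Columns 1, 2 and 4 are numeric; column 3 is read as text (assumption, see above).
% 'TreatAsEmpty' only applies to numeric fields; a '*' there becomes EmptyValue (NaN by default).
Data = textscan(Str, '%f%f%s%f', 'TreatAsEmpty', '*');

col1 = Data{1};   % numeric column
col2 = Data{2};   % numeric column, missing entries filled with NaN
col3 = Data{3};   % cell array of character vectors, '*' kept as text
col4 = Data{4};   % numeric column, missing entries filled with NaN
```

If 'A' should also count as missing, 'TreatAsEmpty' accepts a cell array such as {'*', 'A'}, and all four columns could then be read with %f.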

More Answers (2)

the cyclist on 5 Jun 2011
There is a file in the FEX called "readtext" that should handle this situation:

Zoe on 6 Jun 2011
Thanks, everyone!
The problem is solved. The delimiter is actually a tab. My solution is:
Data = textscan(FID, '%f%f%f%f', 'Delimiter', '\t', 'EmptyValue', 0);
It works perfectly now!
However, if the delimiter is plain whitespace (rare, I admit), I still don't think textscan can handle it. In principle it should, but it seems to confuse the missing value with the delimiter when importing a .txt file.
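To make the tab-delimited case concrete, here is a small sketch. The data is made up for illustration (it is not the original file): a missing value appears as two consecutive tabs, which textscan reads as an empty field and fills with the 'EmptyValue' setting.

```matlab
% Hypothetical tab-delimited data: row 2 lacks its 2nd value, row 3 its 2nd and 3rd.
Str = sprintf('1\t2\t3\t4\n5\t\t7\t8\n9\t\t\t10\n');

% Two consecutive tabs mark an empty field, which is filled with 0 here.
Data = textscan(Str, '%f%f%f%f', 'Delimiter', '\t', 'EmptyValue', 0);

% Combine the four numeric columns into one matrix.
M = [Data{:}];
```

Using 0 as the fill value makes missing entries indistinguishable from real zeros; leaving 'EmptyValue' at its default (NaN) avoids that if the data can contain zeros.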
  Comments: 4
Zoe on 12 Jun 2011
Thanks a lot~
Christopher Conatser on 27 Sep 2016
To extend this question further: I have a similar problem, but the (utterly malicious!) text export function for my instrument also puts an irregular number of spaces between columns, and that number varies with the number of significant figures. Jan, do you (or anyone else) have any suggestions for dealing with it?
Sample tables (already cleaned up considerably):
SAMPLE   BOTTLE  TIME   SOURCE  ERROR  LIQUID
-------  ------  ----   --      --     ------
1,10     1       15:38  F              320
2,10     1       15:42  F              306
3,10     1       15:48  F              310

SAMPLE   BOTTLE  TIME   SOURCE  ERROR  LIQUID
-------  ------  ----   --      --     ------
1,5      13      22:41  F              198
2,5      13      23:35  F       NM     *
3,5      13      00:40  F       NM     *
4,5      13      01:04  F              196
No matter what I've tried (tab delimiter, space delimiter, multispace literals in the formatSpec, setting 'MultipleDelimsAsOne' to false,...) everything skips the "ERROR" column when it is empty.
formatSpec = ' %f,%*f %f %{HH:mm}D %s %s %f';
Thanks for your help!
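One possible workaround, offered only as a sketch and not as a tested solution for the real instrument files: split each data line on runs of whitespace and use the number of tokens to decide whether the ERROR column is present. This assumes that SOURCE is never empty, that ERROR is the only column that can be missing, and that no field contains spaces. The variable names (rows, errCode, and so on) are made up for illustration.

```matlab
% Data rows from the second sample table (header and dashed lines already removed).
rows = { ...
    '1,5 13 22:41 F 198'
    '2,5 13 23:35 F NM *'
    '3,5 13 00:40 F NM *'
    '4,5 13 01:04 F 196'};

n = numel(rows);
sample  = nan(n, 1);
bottle  = nan(n, 1);
liquid  = nan(n, 1);
timeStr = cell(n, 1);
source  = cell(n, 1);
errCode = cell(n, 1);

for k = 1:n
    % strsplit with no delimiter splits on whitespace and collapses repeated spaces.
    parts = strsplit(strtrim(rows{k}));
    if numel(parts) == 5
        % Only five tokens: assume ERROR is the missing one and insert an empty entry.
        parts = [parts(1:4), {''}, parts(5)];
    end
    sample(k)  = str2double(strtok(parts{1}, ','));  % number before the comma
    bottle(k)  = str2double(parts{2});
    timeStr{k} = parts{3};                           % kept as text, e.g. '22:41'
    source{k}  = parts{4};
    errCode{k} = parts{5};                           % '' when the column was empty
    liquid(k)  = str2double(parts{6});               % '*' becomes NaN
end
```

The token-count test is the weak point: if more than one column can be blank in the real export, the columns can no longer be told apart this way, and parsing by fixed character positions would be the safer route.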
