How do I import a file containing both numbers and strings?

I've looked around and found multiple posts about this issue, but none of them really helped me with this problem.
I'm trying to import a text file containing a user-defined square matrix that holds numeric values and the string X, for example:
17.01 24.02 1.03 8.04
23.06 5.07 7.08 14.09
4.01 X 13.03 20.04
10.06 12.07 19.08 21.09
11.01 18.02 25.03 2.04
I've tried using importdata but when it reaches the X, the rest of the line gets replaced by NaN and the lines underneath get ignored. This is what happens:
17.0100 24.0200 1.0300 8.0400
23.0600 5.0700 7.0800 14.0900
4.0100 NaN NaN NaN
Since the matrix is supplied by the user (arbitrarily), its size and the positions of the 'X' entries will differ every time. Knowing that the matrices are always square, what can I do to solve this?

Accepted Answer

Stephen23 on 22 Nov 2018
Edited: Stephen23 on 22 Nov 2018

2 votes

Use textscan and set its 'TreatAsEmpty' option to 'X'.
This will be much more efficient than importing and post-processing (e.g. with regexp).
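A minimal sketch of that suggestion, assuming the matrix really is square as the question states. The temporary file name and the four sample rows are only there to make the snippet self-contained; they are not part of the original answer.

```matlab
% Write four rows of the question's sample data to a temporary file
% (hypothetical file name, used only so the sketch runs on its own).
fname = [tempname '.txt'];
fid = fopen(fname,'wt');
fprintf(fid,'%s\n', ...
    '17.01 24.02 1.03 8.04', ...
    '23.06 5.07 7.08 14.09', ...
    '4.01 X 13.03 20.04', ...
    '10.06 12.07 19.08 21.09');
fclose(fid);

% Read every field as a double; fields equal to 'X' are treated as empty
% and therefore come back as NaN.
fid = fopen(fname,'rt');
C = textscan(fid,'%f','TreatAsEmpty','X');
fclose(fid);

v = C{1};                 % column vector, values in file (row-wise) order
n = sqrt(numel(v));       % side length, assuming a square matrix
assert(n==fix(n),'Number of values is not a perfect square.')
A = reshape(v,n,n).';     % reshape fills column-wise, so transpose

delete(fname);            % remove the temporary file
```

The transpose is needed because textscan returns the values in the order they appear in the file, while reshape fills columns first.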

3 Comments

TADA on 22 Nov 2018
Edited: TADA on 22 Nov 2018
It is about 2-3 times faster than my solution.
But how do you handle not knowing the size of the matrix?
Granted, if the matrix is square you can always reshape using the square root of the number of elements,
but if it isn't, do you have a solution using textscan?
This is how far I got:
fid = fopen('blah.dat');
x1 = textscan(fid, '%n', 'TreatAsEmpty', 'X');
fclose(fid); % close the file once the data has been read
a = x1{:};
% this works well with the example
% but with a different size non-square matrix, it probably won't
A1 = reshape(a, floor(sqrt(length(a))), ceil(sqrt(length(a))))';
By the way, this is not me arguing, but trying to learn... :)
"but how do you solve the part about not knowing the size of the matrix?"
Read the first line using fgetl, count the delimiters, then frewind back to start of the file. Use the count to define a format string using repmat, then read the file using textscan. It sounds complex, but it isn't really:
opt = {'Delimiter',' ','TreatAsEmpty','X','CollectOutput',true};
[fid,msg] = fopen('temp1.txt','rt');
assert(fid>=3,msg)
cnt = nnz(strtrim(fgetl(fid))==' ');
frewind(fid)
fmt = repmat('%f',1,1+cnt);
C = textscan(fid,fmt,opt{:});
fclose(fid);
And checking (the test file is attached):
>> C{1}
ans =
17.0100 24.0200 1.0300 8.0400
23.0600 5.0700 7.0800 14.0900
4.0100 NaN 13.0300 20.0400
10.0600 12.0700 19.0800 21.0900
11.0100 18.0200 25.0300 2.0400
nice +1
still much faster


More Answers (2)

TADA on 22 Nov 2018
txt = fileread('blah.dat');
lines = strsplit(txt, newline);
x = regexp(lines, '[^ ]+', 'match');
items = cat(1,x{:});
A = str2double(items);
Your X is now represented by NaN in the matrix A.
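As far as I know, newline was only introduced in R2016b, while the question lists R2015a. A hedged variant of the same idea that avoids newline and also tolerates a trailing line break or Windows line endings might look like this (the temporary file name and sample rows are assumptions made for illustration):

```matlab
% Write four rows of the question's sample data to a temporary file
% (hypothetical file name, used only so the sketch runs on its own).
fname = [tempname '.txt'];
fid = fopen(fname,'wt');
fprintf(fid,'%s\n', ...
    '17.01 24.02 1.03 8.04', ...
    '23.06 5.07 7.08 14.09', ...
    '4.01 X 13.03 20.04', ...
    '10.06 12.07 19.08 21.09');
fclose(fid);

txt = strtrim(fileread(fname));          % drop leading/trailing whitespace
rows = regexp(txt,'\r?\n','split');      % split on LF or CRLF line endings
tokens = regexp(rows,'\S+','match');     % whitespace-separated tokens per row
items = cat(1,tokens{:});                % requires equal token counts per row
A = str2double(items);                   % non-numeric tokens such as X -> NaN

delete(fname);                           % remove the temporary file
```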
dpb on 22 Nov 2018

0 votes

If you're going to read in text in random locations, unless you can determine or require the user to tell you where those locations are, the only alternative will be to read the whole array as cellstr array and then figure out after the fact "who's who in the zoo" as far as which locations are/aren't numeric.
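One way to sketch this idea: read every token as a string, then use str2double to find out afterwards which positions are numeric. The file name, the sample rows, and the variable names below are assumptions for illustration, and the reshape relies on the matrix being square.

```matlab
% Write four rows of the question's sample data to a temporary file
% (hypothetical file name, used only so the sketch runs on its own).
fname = [tempname '.txt'];
fid = fopen(fname,'wt');
fprintf(fid,'%s\n', ...
    '17.01 24.02 1.03 8.04', ...
    '23.06 5.07 7.08 14.09', ...
    '4.01 X 13.03 20.04', ...
    '10.06 12.07 19.08 21.09');
fclose(fid);

% Read every whitespace-separated token as text.
fid = fopen(fname,'rt');
C = textscan(fid,'%s');
fclose(fid);

tok = C{1};                    % cell array of strings in file order
n = sqrt(numel(tok));          % side length, assuming a square matrix
assert(n==fix(n),'Number of tokens is not a perfect square.')
S = reshape(tok,n,n).';        % cellstr matrix laid out as in the file

A = str2double(S);             % numeric where possible, NaN elsewhere
isText = isnan(A);             % logical mask of the non-numeric positions
[r,c] = find(isText);          % row and column indices of the text entries
textValues = S(isText);        % the original strings at those positions

delete(fname);                 % remove the temporary file
```

Keeping S alongside A means the original text at each non-numeric position remains available, which matters if the file can hold markers other than X.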


Release: R2015a

Asked: 21 Nov 2018

Commented: 22 Nov 2018
