I'm working with an inherited script that calls TEXTSCAN as follows:
allData = textscan(fid,'%s','Delimiter','@');
What does the at-sign delimiter parameter do, and is this documented anywhere?
I don't see anything in the TEXTSCAN help for this, but when I parse the same text file with and without that parameter specified, I get different results. The input file contains no explicit at-sign characters anywhere. Is TEXTSCAN treating the @ as some special control character?

Comments: 5

Mohammad Sami on 8 May 2020
I think the most likely explanation is that when you do not specify a delimiter, MATLAB uses the default delimiters. I believe these are white-space characters, i.e. spaces, tabs, or newline characters.
If you specify a delimiter, MATLAB uses that delimiter instead, so you get different results.
Walter Roberson on 8 May 2020
With the delimiter set to something that does not occur in the text, the effect would be to scan until end of file.
AMM on 8 May 2020
Edited: AMM on 8 May 2020
Thanks, both, for the replies.
Walter, I'm not seeing what you describe. I see effects throughout the input file, not just at the end. If I have a plain-text file that contains no at-signs, and I perform the TEXTSCAN call above with and without the 'Delimiter','@' parameter/value arguments, I get significantly different results:
  • with 'Delimiter','@' (trimmed for compactness):
whos allData_withDelim, allData_withDelim(1), allData_withDelim{1},
Name Size Bytes Class Attributes
allData_withDelim 1x1 34684 cell
ans =
1×1 cell array
{133×1 cell}
ans =
133×1 cell array
{' 3.04 N: GNSS NAV DATA M: Mixed RINEX VERSION / TYPE'}
{'XXXXXXX XXXXX XXXX 20200101 123500 UTC PGM / RUN BY / DATE '}
...
  • without 'Delimiter','@' (similarly trimmed; note the CR/LF linebreaks in the last quoted line):
whos allData_noDelim ; allData_noDelim(1), allData_noDelim{1},
Name Size Bytes Class Attributes
allData_noDelim 1x1 21488 cell
ans =
1×1 cell array
{1×1 cell}
ans =
1×1 cell array
{' 3.04 N: GNSS NAV DATA M: Mixed RINEX VERSION / TYPE←↵XXXXXXX XXXXX XXXX 20200101 123500 UTC PGM / RUN BY / DATE ←↵ ...'}
It sure seems like calling TEXTSCAN with the P/V pair 'Delimiter','@' affects its handling of line endings. In other words, it seems to treat the at-sign as a special character rather than a literal one. (As I mentioned, this input file contains no at-signs anywhere.)
But I don't see this anywhere in the documentation, and I have no idea what's going on with TEXTSCAN "under the hood." Sorry to be obtuse, but is this possible?
Please attach your data file, and also the code you use to reproduce the problem.
The tests I have done find nothing special about using @. The effect I get when I use any character not found in the file is exactly the same as if I use
textscan(fid, '%s', 'Delimiter', '\n', 'Multiple', true)
or
textscan(fid, '%s', 'whitespace', '\n')
and the effect is:
  • each time the %s fires, skip all leading spaces and newlines
  • once the %s starts reading something non-blank, continue until the first newline
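One way to check this equivalence yourself is to run the three calls on a small in-memory text and compare the results. The following is only a minimal sketch: the sample text and variable names are made up for illustration, and 'MultipleDelimsAsOne' is spelled out in full rather than abbreviated. It reports whether the results agree instead of assuming they do.

```matlab
% Hypothetical sample text: three lines, no '@' anywhere.
txt = sprintf('one two\nthree four\nfive six\n');

% (a) a delimiter that never occurs in the text
a = textscan(txt, '%s', 'Delimiter', '@');
% (b) newline as the delimiter, with repeated newlines collapsed
b = textscan(txt, '%s', 'Delimiter', '\n', 'MultipleDelimsAsOne', true);
% (c) newline as the only user-specified white-space character
c = textscan(txt, '%s', 'Whitespace', '\n');

% textscan returns a 1x1 cell; unwrap the column of fields.
a = a{1}; b = b{1}; c = c{1};

% Report whether the calls agree, rather than assuming it.
fprintf('a vs b identical: %d\n', isequal(a, b));
fprintf('a vs c identical: %d\n', isequal(a, c));
disp(a)
```

Replacing '@' in call (a) with any other character that is absent from the text is a quick way to confirm that nothing about the at-sign itself is special.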
AMM on 12 May 2020
Edited: AMM on 12 May 2020
Hi Walter,
Here you go. Here is what I'm seeing with the attached file:
>> fid=fopen('textscan_test.txt','rt');
>> out1=textscan(fid,'%s'); out1=out1{1}; frewind(fid);
>> out2=textscan(fid,'%s','Delimiter','@'); out2=out2{1};
>> out3=textscan(fid,'%s','whitespace','\n'); out3=out3{1}; fclose(fid);
>> whos
Name Size Bytes Class Attributes
ans 1x1 8 double
fid 1x1 8 double
out1 2700x1 351730 cell
out2 538x1 134220 cell
out3 538x1 134220 cell
As you can see, the attached file contains no at-signs.
Indeed, what seems to be happening is exactly what you describe: if textscan is given a delimiter that doesn't occur in the input, each %s field runs to the end of the line, so the result (out2) matches the 'whitespace','\n' call (out3).
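For anyone without the attached file, the session above can be approximated with a small stand-in file. This is a sketch under stated assumptions: the file content, the temporary file name, and the variable hasAt are hypothetical and are not the original textscan_test.txt. It also checks directly that the file contains no at-sign before comparing the two reads.

```matlab
% Write a small stand-in file (hypothetical content, LF line endings).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'one two\nthree four\nfive six\n');
fclose(fid);

% Confirm the file really contains no at-sign.
raw = fileread(fname);
hasAt = any(raw == '@');

% Read it both ways, rewinding between calls as in the session above.
fid = fopen(fname, 'rt');
out1 = textscan(fid, '%s');                   out1 = out1{1}; frewind(fid);
out2 = textscan(fid, '%s', 'Delimiter', '@'); out2 = out2{1};
fclose(fid);
delete(fname);

% out1 holds white-space-separated tokens; out2 holds whole lines.
fprintf('contains @: %d, tokens: %d, lines: %d\n', ...
        hasAt, numel(out1), numel(out2));
```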


Accepted Answer

per isakson on 13 May 2020


I've reproduced your result on R2018b. I think the result is consistent with the textscan documentation.
  • out1 is a cell array of character arrays with one item per cell
  • out2 is a cell array of character arrays with one data row per cell
Case 1. One or more spaces are used as the delimiter. That is the default, regardless of the value of 'MultipleDelimsAsOne'. The documentation says: "If you do not specify a delimiter, then the delimiter characters are the same as the white-space characters."
Case 2. '@' is used as the delimiter. Since no delimiter is found, '%s' matches the entire row. (I can't find a sentence in the documentation to quote; the row-oriented behavior seems to go without saying.)
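To illustrate Case 2, here is a small sketch showing '@' acting as an ordinary literal delimiter. The sample text is hypothetical: the first line contains an at-sign and the second does not. The field should split where the at-sign occurs, and otherwise end at the end of the row.

```matlab
% Hypothetical sample: line 1 contains an at-sign, line 2 does not.
txt = sprintf('a@b c\nd e\n');

% '@' is the only delimiter, so spaces no longer split fields.
fields = textscan(txt, '%s', 'Delimiter', '@');
fields = fields{1};   % unwrap the 1x1 cell returned by textscan

% Each field ends at an at-sign or at the end of its row.
disp(fields)
```

When the input has no at-sign at all, only the end-of-row rule applies, which is why each cell ends up holding one full line.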


Release: R2020a


Asked: AMM on 7 May 2020
Answered: 13 May 2020
