read empty line by textscan

Question

0 votes

Hi Everyone,

I am trying to organize a txt file with 12000 lines, which is too large to use readtable, so I chose to use textscan instead.

The problem is that textscan skips all the empty lines, but I need the exact line numbers of certain elements in the original file.

I searched a lot online, but nothing helped. I tried code like this, which sets the whitespace characters to empty, but it doesn't help either.

default = textscan(fid,'%s%s','Delimiter','=','whitespace', '')

Thank you for your help!

Comments: 2

Rik, 11 April 2019

Did you try either suggested solution? If you still have issues, we'll be happy to help.

Jeremy Hughes, 11 April 2019

I know someone has already added a solution, and it's a fine solution for what you're doing. But I'm surprised that READTABLE has a problem. Can you attach a sample?

12,000 lines isn't all that large especially if there are only two columns.

If you have R2019a, you might also try:

M = readmatrix(filename,'OutputType','string','Delimiter','=','Whitespace','')


Answer 1

Rik, 10 April 2019

Edited: Rik, 10 April 2019

2 votes

If your file doesn't contain any special characters, you could try fileread (which reads a file as one long char array), then split it with regexp. If you aren't sure about the encoding of special characters, you may consider my readfile function (which returns a cell array with 1 element per line, also for empty lines).

default = fileread(filename);
default = regexp(default,'\n','split');
%or:
default = readfile(filename);

The output of those two methods is equivalent if there are no special characters encoded in the file. The allowed characters are shown below. (readfile doesn't have this restriction)

% $%&'()*+,-./0123456789:;<=>?@
% ABCDEFGHIJKLMNOPQRSTUVWXYZ
% [\]^_`abcdefghijklmnopqrstuvwxyz{|}~
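
To tie this back to the original question, here is a minimal sketch of how the split result can be used to recover original line numbers. The text in `raw` is made-up sample data standing in for the output of fileread, and the key name `beta` is purely hypothetical.

```matlab
% Made-up content standing in for fileread(filename); line 2 is empty.
raw = sprintf('alpha=1\n\nbeta=2\ngamma=3');

% One cell per line; the empty line is kept as an empty char array.
lines = regexp(raw, '\n', 'split');

% Original line number of the line starting with the hypothetical key 'beta'.
lineNo = find(strncmp(lines, 'beta=', 5));   % 3

% Split only the non-empty lines into key/value pairs at the first '='.
isData = ~cellfun(@isempty, lines);
parts  = regexp(lines(isData), '=', 'split', 'once');
```

Because the empty lines stay in `lines`, an index into that cell array is the line number in the original file; filtering with `isData` happens only afterwards, so the numbering is not disturbed.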

Comments: 5 (3 older comments hidden)

Jeremy Hughes, 11 April 2019

Edited: Jeremy Hughes, 11 April 2019

default = regexp(default,'\n','split');

This won't work if the file uses \r\n (Windows) newlines; at the very least you'll be left with trailing \r characters.

If you're using R2016b or later, try:

https://www.mathworks.com/help/matlab/ref/splitlines.html

default = splitlines(default);

It's a little more robust, and since it has only one job to do, probably slightly faster than regexp.
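
The difference can be seen in a small sketch like the following, which uses a made-up char array with Windows line endings in place of a real file:

```matlab
% Made-up text with \r\n line endings and one empty line in the middle.
txt = sprintf('a=1\r\n\r\nb=2');

% Splitting on \n alone leaves a trailing \r on the first two elements.
viaRegexp = regexp(txt, '\n', 'split');   % 1x3 cell (row)

% splitlines treats \r\n as a single line break.
viaSplit = splitlines(txt);               % 3x1 cell (column)

% Line numbers of the empty lines in the original text.
emptyLines = find(cellfun(@isempty, viaSplit));   % 2
```

With the regexp version the "empty" line actually contains a single \r, so a test such as isempty would miss it unless the carriage returns are removed first.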

Rik, 11 April 2019

Edited: Rik, 11 April 2019

To make the regexp splitting more robust (which will be in my next version of readfile):

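CRLF=[13 10]; % carriage return (\r) and line feed (\n)
CRLF=CRLF([any(default==13) any(default==10)]); % keep only the characters present in the text
if isempty(CRLF),CRLF=10;end % no line breaks found: fall back to \n
default = regexp(default,char(CRLF),'split'); % char() so the pattern is passed as text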

splitlines will probably be faster, while the code I showed here is backwards compatible to R14 (v7.0, which was when regexp was expanded to support outkeys).

Edit:

I just noticed I had this line already in my function:

str(str==13)='';

So readfile already splits it correctly for \r\n files.
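
As a rough illustration of that approach (not the actual readfile source, just the one quoted line combined with the earlier split, run on made-up sample text):

```matlab
% Made-up text with \r\n line endings and one empty line.
str = sprintf('a=1\r\n\r\nb=2');

str(str==13) = '';                    % drop every carriage return (char 13)
lines = regexp(str, '\n', 'split');   % split on the remaining line feeds

nLines = numel(lines);                % 3
```

One caveat with this sketch: a file that uses a lone \r as its line break would end up as a single line, since all of its line breaks are removed before the split.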


Answer 2

Bob Thompson, 10 April 2019

Edited: Rik, 10 April 2019

0 votes

I'm going to guess that the extra lines are not consistent?

Generally, I would suggest reading the entire file in as one string, then splitting it at the new line characters. The exact coding may be a bit off from the below example, but it should put you on the right track.

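% textscan(fid,'%s') splits on whitespace and returns a nested cell, so fread is used here instead
default = fread(fid,[1 Inf],'*char'); % Read the file as one block (a char row vector)
default = regexp(default,'\r?\n','split'); % Split the string into multiple cells at each new line character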

Comments: 3 (1 older comment hidden)

Bob Thompson, 10 April 2019

Yes, I do. Thank you for catching that, I was using repmat for other things recently.

zhiwen wan, 11 April 2019

Thank you very much Bob, problem solved:)
