
Read string from files since R2020a

Asked: Radek on 20 Apr 2020
Commented: Radek on 22 Apr 2020
I have a large binary data file with an ASCII-formatted metadata header at the beginning. To read this header, I use string-oriented read functions such as fgetl(~), fscanf(~, '%s', ~), or fread(~, ~, '*char'). In MATLAB versions prior to R2020a (I have R2014b and R2019b) this worked fine, but something changed in R2020a.
Now the very first attempt (and only the first) to read any string from the file uses an excessive amount of memory and freezes MATLAB. My guess is that MATLAB is trying to read the whole file into memory. In my case the file is larger than the available RAM, which probably causes the freeze.
Here is what I do:
% Here everything works just fine
fd = fopen('file.name', 'r');
arr1 = fread(fd, 1);
arr2 = fread(fd, 1);
fclose(fd);
% Here I have a problem
fd = fopen('file.name', 'r');
arr1 = fread(fd, 1); % fast and smooth
arr2 = fread(fd, 1, '*char'); % slow and uses an excessive amount of RAM
arr3 = fread(fd, 1, '*char'); % fast and smooth again
fclose(fd);
1) It does not matter which part of the file I read.
2) All read functions that return numeric types are always fast.
3) The first read that returns a string is always slow, regardless of which function I use (as long as it returns a string).
4) All subsequent string reads are as fast as numeric ones.
5) Once the read function returns the string, the memory is released.
6) The file position indicator is always at the expected position (it does not move to the end of the file).
7) It does not matter whether the file is opened in text or binary mode.
8) The issue occurs on both Windows and Linux.
Any idea?

Accepted Answer

Sindar on 22 Apr 2020
From the release notes:
"As of R2020a, character-oriented file I/O functions such as fscanf, fgets, and fgetl trigger automatic character set detection when reading a file that was opened using fopen without a specified encoding."
My suspicion then is that the "automatic character set detection" may require looking through the full file.
Try specifying the encoding in fopen, e.g.,
fd = fopen('file.name', 'r','n','UTF-8');
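As a fuller illustration of that suggestion, here is a minimal sketch that builds a small stand-in file and then reads its header and binary payload through a file handle opened with an explicit encoding. The file layout, the header text, and the variable names are invented for this example. 'UTF-8' is an assumption that suits a pure-ASCII header; a different encoding may be appropriate for other files.

```matlab
% Build a small stand-in file: one ASCII header line followed by binary data.
% (The layout and header text are hypothetical, for illustration only.)
fname = [tempname '.bin'];
fd = fopen(fname, 'w');
fwrite(fd, uint8(sprintf('WIDTH=4 TYPE=double\n')), 'uint8');
fwrite(fd, [1.5 2.5 3.5 4.5], 'double');
fclose(fd);

% Reopen with an explicit encoding (4th argument of fopen), so that
% character-oriented reads do not rely on automatic detection.
fd = fopen(fname, 'r', 'n', 'UTF-8');
header  = fgetl(fd);                 % header line, newline removed
payload = fread(fd, 4, 'double')';   % binary data that follows the header
fclose(fd);
delete(fname);

disp(header)
disp(payload)
```

The only change relative to the code in the question is the two extra arguments to fopen; the read calls themselves stay the same.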
  2 Comments
Radek on 22 Apr 2020
I can confirm that specifying the encoding solves the issue. Thank you, Sindar.
Next time, I should check the release notes more carefully.
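Observation 2 in the question (numeric reads are always fast) suggests another possible approach: read the header as raw bytes and convert them to characters afterwards. The sketch below is only an illustration and has not been checked against the large file from the question. It assumes a pure-ASCII header terminated by a line feed, and the 64-byte upper bound on the header length, the file layout, and the variable names are all made up for the example.

```matlab
% Build a small stand-in file: ASCII header line plus int32 payload.
% (Hypothetical layout, for illustration only.)
fname = [tempname '.bin'];
fd = fopen(fname, 'w');
fwrite(fd, uint8(sprintf('HDR v1\n')), 'uint8');
fwrite(fd, [10 20 30], 'int32');
fclose(fd);

fd = fopen(fname, 'r');
maxHeaderBytes = 64;                          % assumed upper bound on header size
raw = fread(fd, maxHeaderBytes, '*uint8')';   % numeric read, returns uint8 row
nl = find(raw == 10, 1);                      % index of the first line feed
header = char(raw(1:nl-1));                   % bytes -> characters (ASCII assumed)
fseek(fd, nl, 'bof');                         % reposition just past the line feed
payload = fread(fd, 3, 'int32')';             % binary data after the header
fclose(fd);
delete(fname);

disp(header)
disp(payload)
```

Because only numeric reads touch the file, the conversion to text happens in memory on a short byte array. A header containing non-ASCII characters would need native2unicode with the correct encoding name instead of char.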

