MATLAB Answers


How to read just a part of a binary file with a predefined end position or a predefined amount of Bytes?

Asked by Sebastian on 12 Dec 2018
Latest activity: edited by Sebastian on 13 Dec 2018
Hi. I have searched a lot to find the answer, but was not successful.
I want to get data records ({'uint16' 'uint16' 'uint16' 'uint8' 'uint8'} = 8 Bytes) out of a binary file.
The files have millions of records with 1 min time steps and a given start date.
Up to now, I was able to define the start position by skipping the wanted time duration (1 record of 8 Bytes = 1 min) with fseek.
My problem is that I cannot find a way to define the end position or the number of records for fread.
One solution would be to use a loop in which the record length is added to the fseek offset on each pass and the rest of the file is skipped after every record. But this is grossly inefficient and would likely take even more time than reading the whole file and picking the wanted part out of the resulting matrix, I guess.
I hope you understand what I want to ask...
I need something like fread(fileID,start_position,end_position or number of records).
Thanks in advance.
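
Since one record corresponds to one minute, the byte offset for fseek and the number of records to request can be derived from the dates. The following is a minimal sketch of that arithmetic; the three dates and all variable names are hypothetical and only serve as an illustration.

```matlab
% Hypothetical dates for illustration only
fileStart   = datenum(2018, 1, 1, 0, 0, 0);   % time stamp of the first record in the file
wantedStart = datenum(2018, 3, 1, 0, 0, 0);   % first minute to read
wantedEnd   = datenum(2018, 3, 2, 0, 0, 0);   % last minute to read (inclusive)
recordBytes   = 8;                            % 3 x uint16 + 2 x uint8
minutesPerDay = 24 * 60;

% datenum counts days, so a difference times 1440 gives minutes (= records)
recordsToSkip = round((wantedStart - fileStart) * minutesPerDay);
numRecords    = round((wantedEnd - wantedStart) * minutesPerDay) + 1;
byteOffset    = recordsToSkip * recordBytes;  % offset to pass to fseek(fid, byteOffset, 'bof')
```

With these example dates, 59 days are skipped, so recordsToSkip is 84960 and numRecords is 1441.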

  3 comments

I'd use fseek and fread like you did. Why do you say it's "grossly inefficient"? Do you have evidence you can share that it's inefficient, or really slower than using fread and throwing away the values? Have you seen memmapfile()?
Hi Image Analyst,
I don't have evidence, but since it takes more time to get every cell from a matrix with a loop than to access the matrix directly, I concluded that the loop approach would also increase the time needed for this purpose.
I just read the memmapfile documentation. As far as I understood, this function also only offers an offset to skip the first n bytes, but no way to define a number of wanted records or an end position. I just can't understand why they did not add such an input argument when they implemented the offset argument...
The thing is, I know that it does not take ages to read a binary file. In my case it takes 15 to 20 seconds... But I just started a new project and will have to use this function a lot of times for the next 3 years. So saving a few seconds each time will add up to a not insignificant amount of time.
I deal with 3-D CT images of up to 20 GB in size and I use fseek() and fread() to read slices out of the middle of the file and it's pretty quick, like a second or two. I'm not aware of any other ways, so you might call the Mathworks and ask them. How big are your files?
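
For reference, memmapfile does appear to cover this case: besides 'Offset' it takes a 'Repeat' parameter that limits how many times the 'Format' is applied, which works as a record count. The sketch below writes a small temporary demo file with made-up values and maps records 4 to 6 of it; the field names 'words' and 'bytes' are hypothetical.

```matlab
% Create a small demo file: 10 records of 3 x uint16 + 2 x uint8 (made-up values)
filename = [tempname '.bin'];
fid = fopen(filename, 'w');
for k = 1:10
    fwrite(fid, uint16([k, 10*k, 100*k]), 'uint16');
    fwrite(fid, uint8([k, k + 1]), 'uint8');
end
fclose(fid);

recordBytes = 8;
firstRecord = 4;   % 1-based index of the first wanted record
numRecords  = 3;   % how many records to map

% 'Offset' skips the leading bytes, 'Repeat' limits the number of records
m = memmapfile(filename, ...
    'Format', {'uint16', [1 3], 'words'; 'uint8', [1 2], 'bytes'}, ...
    'Offset', (firstRecord - 1) * recordBytes, ...
    'Repeat', numRecords);

records    = m.Data;             % numRecords-by-1 struct array
firstWords = records(1).words;   % the three uint16 values of record 4
```

Whether this is faster than fseek plus fread for a given file would have to be measured.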





1 answer

Answer by Guillaume on 12 Dec 2018
 Accepted answer

I'm not entirely sure I understand, but maybe this is what you want:
recordstart = ??? %some integer value. Index of first desired record (1-based)
numrecords = ??? %how many records to get
filepath = ??? %path of the file
recordtypes = {'uint16', 'uint16', 'uint16', 'uint8', 'uint8'};
recordsizes = [2, 2, 2, 1, 1]; %size of each type in bytes. Must match recordtypes
fid = fopen(filepath, 'r');
fseek(fid, (recordstart - 1) * sum(recordsizes), 'bof'); %jump to the first desired record
data = fread(fid, [sum(recordsizes), numrecords], '*uint8'); %read numrecords records as raw bytes, one record per column
fclose(fid);
data = mat2cell(data, recordsizes, numrecords); %split the rows into one block of bytes per field
data = cellfun(@(bytes, type) typecast(bytes(:), type), data', recordtypes, 'UniformOutput', false); %convert each block to its type
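
To try the approach end to end, here is a minimal self-contained sketch. It writes a temporary file of 10 made-up records, then reads records 4 to 6 with the same fseek, fread and typecast steps. The file contents and the variable names raw and fields are assumptions for the demo, and the file is written and read in the machine's native byte order.

```matlab
% Write a demo file with 10 records of 3 x uint16 + 2 x uint8 (made-up values)
filepath = [tempname '.bin'];
fid = fopen(filepath, 'w');
for k = 1:10
    fwrite(fid, uint16([k, 10*k, 100*k]), 'uint16');
    fwrite(fid, uint8([k, k + 1]), 'uint8');
end
fclose(fid);

recordstart = 4;   % 1-based index of the first desired record
numrecords  = 3;   % how many records to get
recordtypes = {'uint16', 'uint16', 'uint16', 'uint8', 'uint8'};
recordsizes = [2, 2, 2, 1, 1];

% Jump to the first desired record and read only the requested bytes
fid = fopen(filepath, 'r');
fseek(fid, (recordstart - 1) * sum(recordsizes), 'bof');
raw = fread(fid, [sum(recordsizes), numrecords], '*uint8');   % one record per column
fclose(fid);
delete(filepath);

% Split the byte rows per field, then reinterpret each block as its type
fields = mat2cell(raw, recordsizes, numrecords);
fields = cellfun(@(bytes, type) typecast(bytes(:), type), ...
    fields', recordtypes, 'UniformOutput', false);
% fields{2} holds uint16 [40; 50; 60], fields{5} holds uint8 [5; 6; 7]
```

Each cell of fields is a column vector with one value per record, in the order of recordtypes.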

  1 comment

Thanks a lot!
That's what I wanted. When I started writing my function I came across the input argument 'sizeA', but I searched on and forgot about it. I think I falsely assumed that this argument would still read the whole binary file and then just rearrange the output...
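
One way to check that sizeA really limits how much is read is to look at the file position with ftell after the call. A small sketch, assuming a throwaway 80-byte file (10 records of 8 bytes) created just for the test:

```matlab
% Throwaway file: bytes 1..80, i.e. 10 records of 8 bytes each
filepath = [tempname '.bin'];
fid = fopen(filepath, 'w');
fwrite(fid, uint8(1:80), 'uint8');
fclose(fid);

fid = fopen(filepath, 'r');
fseek(fid, 3 * 8, 'bof');                % skip the first 3 records (24 bytes)
chunk = fread(fid, [8, 2], '*uint8');    % request exactly 2 records
posAfterRead = ftell(fid);               % 24 + 16 = 40 if only 16 bytes were consumed
fclose(fid);
delete(filepath);
```

Here posAfterRead is 40 rather than 80, so the read stops after the requested two records instead of running to the end of the file.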
