What is the difference between readall and read+hasdata?

조회 수: 15 (최근 30일)
fa wu
fa wu 2023년 7월 20일
편집: fa wu 2023년 7월 24일
I found that the functions of readall and read+hasdata seem to be exactly the same. read+hasdata is a loop body, is it less efficient? So in any case you should avoid using read+hasdata? Why does matlab also provide the hasdata function?
In what scenario is it more meaningful to use read+hasdata?
ds = datastore('mapredout.mat');
while hasdata(ds)
T = read(ds);
end
ds = datastore('mapredout.mat');
readall(ds)
I think this comment is help:comment1 comment2

답변 (1개)

Mrutyunjaya Hiremath
Mrutyunjaya Hiremath 2023년 7월 21일
The functions `readall` and `read` with `hasdata` are used for reading data from datastores. These functions are not exactly the same, and they serve different purposes.
Using `read` with `hasdata` can be more meaningful and efficient in scenarios where:
  1. Memory efficiency is important:
  2. Processing data in a streaming fashion:
  3. Parallel processing:
`readall` is suitable for smaller datasets that can fit into memory, while `read` with `hasdata` is more appropriate for larger datasets or scenarios where memory efficiency and streaming processing are important.
  댓글 수: 10
Walter Roberson
Walter Roberson 2023년 7월 23일
Your file name mapredout.mat hints that the .mat file might be the output of a mapreduce() call . If so then it is a Key-Value Datastore https://www.mathworks.com/help/matlab/ref/matlab.io.datastore.keyvaluedatastore.html . Key-Value datastores default to
ReadSize — Maximum number of key-value pairs to read
1 (default) | positive integer
Maximum number of key-value pairs to read in a call to the read or preview
functions, specified as a positive integer.
So any one read() call on the datastore is not going to read all of the data.
The particular datastore you are using might have been configured for a larger ReadSize, but the ReadSize cannot be set to be infinite -- in general when you read() from a datastore, even one configured with only a single .mat file, the read() might not read in all of the data if the datastore is large enough . Whereas readall() will always read all of the data, provided that it does not run out of memory.
For testing purposes, I suggest you experiment with
while hasdata(str)
T = read(str)
end
T
and see whether the read() is being called more than once, and if so whether the T at the end has all of the data that was read in. Depending on the kind of datastore and how big it is, sometimes a single read() is enough to read in all of the data; other datastores might need to read the data in chunks when you read(), and other datastores might only read one file at a time if the datastore has multiple files.
fa wu
fa wu 2023년 7월 24일
Thank you for your guidance and help. I'll do an experiment

댓글을 달려면 로그인하십시오.

카테고리

Help CenterFile Exchange에서 Workspace Variables and MAT Files에 대해 자세히 알아보기

제품


릴리스

R2020a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by