What is the difference between readall and read+hasdata?
I found that readall and read+hasdata seem to do exactly the same thing. Since read+hasdata needs a loop, is it less efficient? Should read+hasdata therefore always be avoided? And why does MATLAB provide the hasdata function at all?
In what scenario is it more meaningful to use read+hasdata?
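% Option 1: read the datastore one chunk at a time
ds = datastore('mapredout.mat');
while hasdata(ds)
    T = read(ds);   % T is overwritten on every pass
end

% Option 2: read everything in a single call
ds = datastore('mapredout.mat');
readall(ds)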
Answers (1)
Mrutyunjaya Hiremath
July 21, 2023
The functions `readall` and `read` with `hasdata` are used for reading data from datastores. These functions are not exactly the same, and they serve different purposes.
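Using `read` with `hasdata` can be more meaningful and efficient in scenarios where memory efficiency is important, where data is processed in a streaming fashion, or where parallel processing is used.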
`readall` is suitable for smaller datasets that can fit into memory, while `read` with `hasdata` is more appropriate for larger datasets or scenarios where memory efficiency and streaming processing are important.
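To make the memory trade-off concrete, here is a minimal sketch that computes a mean both ways. It does not use the mapredout.mat file from the question: the CSV file, the column name `x`, and the small `ReadSize` are assumptions chosen only so the example is self-contained.

```matlab
% Build a small CSV file so the example is self-contained (hypothetical data).
csvFile = [tempname '.csv'];
writetable(table((1:10)', 'VariableNames', {'x'}), csvFile);

% Streaming: only one chunk of at most ReadSize rows is held at a time.
ds = tabularTextDatastore(csvFile);
ds.ReadSize = 3;                 % deliberately small, to force several reads
total = 0;
count = 0;
while hasdata(ds)
    chunk = read(ds);            % table with at most 3 rows
    total = total + sum(chunk.x);
    count = count + height(chunk);
end
streamMean = total / count;

% All at once: the whole table has to fit in memory.
allData  = readall(ds);
fullMean = mean(allData.x);

delete(csvFile);                 % clean up the temporary file
```

Both approaches give the same mean here; the difference is that the loop never holds more than one chunk, which is what matters once the data no longer fits in memory.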
Comments: 10
Walter Roberson
July 23, 2023
Your file name mapredout.mat hints that the .mat file might be the output of a mapreduce() call. If so, then it is a Key-Value Datastore (https://www.mathworks.com/help/matlab/ref/matlab.io.datastore.keyvaluedatastore.html). The documentation for Key-Value datastores gives this default:
ReadSize — Maximum number of key-value pairs to read
1 (default) | positive integer
Maximum number of key-value pairs to read in a call to the read or preview functions, specified as a positive integer.
So any one read() call on the datastore is not going to read all of the data.
The particular datastore you are using might have been configured for a larger ReadSize, but ReadSize cannot be set to infinity. In general, when you read() from a datastore, even one configured with only a single .mat file, the read() might not return all of the data if the datastore is large enough. readall(), by contrast, always reads all of the data, provided that it does not run out of memory.
For testing purposes, I suggest you experiment with
while hasdata(ds)
    T = read(ds)
end
T
and see whether the read() is being called more than once, and if so whether the T at the end has all of the data that was read in. Depending on the kind of datastore and how big it is, sometimes a single read() is enough to read in all of the data; other datastores might need to read the data in chunks when you read(), and other datastores might only read one file at a time if the datastore has multiple files.
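The experiment suggested above could be sketched roughly as follows. This is not the asker's Key-Value datastore: the temporary CSV file, the column name `x`, and `ReadSize = 2` are assumptions used to get a datastore that needs more than one read() without depending on mapredout.mat.

```matlab
% Hypothetical stand-in data: a 5-row CSV file in the temp folder.
csvFile = [tempname '.csv'];
writetable(table((1:5)', 'VariableNames', {'x'}), csvFile);

ds = tabularTextDatastore(csvFile);
ds.ReadSize = 2;                 % each read() returns at most 2 rows

numReads = 0;
parts = {};
while hasdata(ds)
    T = read(ds);                % T holds only the most recent chunk
    numReads = numReads + 1;
    parts{end+1} = T;            %#ok<SAGROW> keep every chunk for comparison
end
combined = vertcat(parts{:});    % all chunks stacked into one table
everything = readall(ds);        % the same data in a single call

fprintf('read() was called %d times\n', numReads);
fprintf('last T: %d rows, combined: %d rows, readall: %d rows\n', ...
    height(T), height(combined), height(everything));

delete(csvFile);                 % clean up the temporary file
```

If read() is called more than once, the final T contains only the last chunk, so the chunks have to be accumulated (or processed one by one) to cover what readall() returns.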