Why does Matlab transpose hdf5 data?
There is an apparent bug in Matlab's HDF5 read/write utilities that breaks interoperability with other code. Simple array datasets are written to the file with the transpose of their actual shape, and transposed back when read. I imagine this is because Matlab uses column-major (Fortran-style) order, whereas the HDF5 standard uses row-major (C-style) order.
Minimal example that illustrates the problem:
h5create('test.h5', '/dataset', [2,3]);
h5write('test.h5', '/dataset', reshape(1:6,[2,3]))
Running the HDF5 utility h5ls on the output reveals the problem:
$ h5ls test.h5
dataset Dataset {3, 2}
This is not evident if only using the HDF5 tools from within Matlab, since reading the dataset in also transposes it back.
>> h5read('test.h5', '/dataset')
ans =
1 3 5
2 4 6
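To see why h5ls reports {3, 2}, it may help to look at the flat buffer itself. The following is a minimal sketch in plain Python, not Matlab's or HDF5's actual implementation: it assumes Matlab hands its column-major buffer to the file unchanged and only reverses the list of dimensions. The helper names `column_major_view` and `row_major_view` are made up for this illustration.

```python
# Flat memory layout of reshape(1:6, [2,3]) in Matlab (column by column).
rows, cols = 2, 3
matlab_buffer = [1, 2, 3, 4, 5, 6]


def column_major_view(buf, nrows, ncols):
    """Element (i, j) sits at offset i + j*nrows (first index varies fastest)."""
    return [[buf[i + j * nrows] for j in range(ncols)] for i in range(nrows)]


def row_major_view(buf, nrows, ncols):
    """Element (i, j) sits at offset i*ncols + j (last index varies fastest)."""
    return [[buf[i * ncols + j] for j in range(ncols)] for i in range(nrows)]


# How Matlab interprets the buffer: a 2x3 matrix.
as_matlab = column_major_view(matlab_buffer, rows, cols)

# How a row-major reader interprets the same buffer with reversed dims {3, 2}.
as_c = row_major_view(matlab_buffer, cols, rows)

print(as_matlab)  # [[1, 3, 5], [2, 4, 6]]
print(as_c)       # [[1, 2], [3, 4], [5, 6]]
```

The two views are transposes of each other even though no data was moved, which is consistent with the h5ls output above.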
Matlab should either fix this in future versions or mention the convention in the documentation, since people mostly choose HDF5 for interoperability with other systems, and this can be a tricky bug to find.
In versions:
- h5ls: Version 1.8.14
- Matlab 8.6.0.267246 (R2015b) GLNXA64
1 Comment
Daniel Döhring on 24 May 2019
Edited: Daniel Döhring on 24 May 2019
Actually this bug seems to be still around. In my case, a (pseudo) multiarray of dimensions […] is in Matlab internally permuted to […]. As a consequence, it is impossible to write back a multiarray in dimensions […], since Matlab does not represent matrices in […] manner.
Accepted Answer
More Answers (3)
Kameron Harris on 20 Oct 2016
Edited: Kameron Harris on 20 Oct 2016
1 vote
Kameron Harris on 20 Oct 2016
Edited: Kameron Harris on 20 Oct 2016
0 votes
1 Comment
James Tursa on 20 Oct 2016
The HDF Group intent seems to be that applications should be able to write to the file in a native storage order. This seems reasonable to me, especially from a speed standpoint. Why cripple column-ordered languages (Fortran, MATLAB) with a hard requirement to permute the data each time you read/write?
Kameron Harris on 20 Oct 2016
Edited: Kameron Harris on 20 Oct 2016
0 votes
2 Comments
James Tursa on 20 Oct 2016
Well, so this pretty much answers the question. The HDF Group intended the various applications (Fortran, MATLAB, C, C++, Python, etc) to be able to write to the file in a native storage order and simply list the dimensions of the data in the file in a specified order (slowest changing first ... fastest changing last). It is then incumbent on the user to know what storage order his/her applications use if they are to share data through this file format ... and permute the data accordingly if necessary.
So given this language in the HDF doc, I would say MATLAB is doing everything correctly (but maybe could help the user out with some documentation about interoperability with other languages/applications).
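For a reader on the row-major side, "permute the data accordingly" amounts to reversing the axis order of what was read. Below is a rough sketch in plain Python of that axis reversal for any number of dimensions. The function `reverse_axes` is a hypothetical helper written for this illustration. In practice one would more likely use `permute(A, ndims(A):-1:1)` in MATLAB or `transpose()` in NumPy.

```python
from itertools import product


def reverse_axes(flat, shape):
    """Reverse the axis order of a row-major (C-order) buffer.

    Returns (new_flat, new_shape), where new_shape is `shape` reversed
    and out[i_n, ..., i_1] == in[i_1, ..., i_n].
    """
    def offset(index, dims):
        # Row-major offset: the last index varies fastest.
        off = 0
        for i, d in zip(index, dims):
            off = off * d + i
        return off

    new_shape = tuple(reversed(shape))
    out = [None] * len(flat)
    for index in product(*(range(d) for d in shape)):
        out[offset(tuple(reversed(index)), new_shape)] = flat[offset(index, shape)]
    return out, new_shape


# What a C-order reader sees for the file from the question:
# dims {3, 2}, values 1..6 in file order.
seen_flat, seen_shape = [1, 2, 3, 4, 5, 6], (3, 2)

fixed_flat, fixed_shape = reverse_axes(seen_flat, seen_shape)
print(fixed_shape, fixed_flat)  # (2, 3) [1, 3, 5, 2, 4, 6]
```

Read as a row-major 2x3 array, the result is [[1, 3, 5], [2, 4, 6]], the matrix as MATLAB displays it. The copy costs one pass over the data, which is the overhead the comment above argues should not be forced on every read and write.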
Kameron Harris on 20 Oct 2016