problem reading HDF5 on s3

10 views (last 30 days)
Ben Dichter on 22 Feb 2022
Edited: Ben Dichter on 6 Nov 2022
>> H5F.open('https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9')
Error using hdf5lib2
Unable to access
'https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9'.
The specified URL scheme is invalid.
Error in H5F.open (line 130)
file_id = H5ML.hdf5lib2('H5Fopen', filename, flags, fapl, is_remote);
This works when using the same URL with h5py in Python:
from h5py import File
file = File("https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9", "r", driver="ros3")
file.keys()
<KeysViewHDF5 ['acquisition', 'analysis', 'file_create_date', 'general', 'identifier', 'processing', 'session_description', 'session_start_time', 'specifications', 'stimulus', 'timestamps_reference_time', 'units']>

Accepted Answer

Ben Dichter on 6 Nov 2022
Edited: Ben Dichter on 6 Nov 2022
Two things were needed to solve this:
  1. Pass an s3:// path, not an http or https URL.
  2. Delete or rename ~/.aws/credentials (on Windows, something like C:/Users/username/.aws/credentials).
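To illustrate point 1, here is a small sketch (in Python, since h5py is already used above) that derives the s3:// form from a virtual-hosted-style S3 URL like the one in this thread. The helper name `https_to_s3_uri` is my own, and the conversion assumes the `<bucket>.s3.amazonaws.com/<key>` layout; path-style or region-qualified S3 URLs would need different handling.

```python
from urllib.parse import urlparse

def https_to_s3_uri(url: str) -> str:
    """Convert a virtual-hosted-style S3 URL to an s3:// URI.

    Assumes the host looks like "<bucket>.s3.amazonaws.com".
    """
    parsed = urlparse(url)
    bucket = parsed.netloc.split(".s3")[0]  # e.g. "dandiarchive"
    return f"s3://{bucket}{parsed.path}"    # keep the object key unchanged

print(https_to_s3_uri(
    "https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9"
))
# s3://dandiarchive/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9
```

The resulting URI is what MATLAB expects, i.e. something along the lines of H5F.open('s3://dandiarchive/blobs/...'), provided no conflicting ~/.aws/credentials file interferes (point 2).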

More Answers (1)

Yongjian Feng on 23 Feb 2022
In Python, you seem to be using the read-only flag ("r"). Maybe you want to try:
H5F.open('https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9', 'H5F_ACC_RDONLY')
1 Comment
Ben Dichter on 23 Feb 2022
That does not appear to be the issue. The documentation for H5F indicates that it should be possible to include only the s3 path:
file_id = H5F.open(URL) opens the hdf5 file at a remote location
for read-only access and returns the file identifier, file_id.
Also, this use-case is demonstrated in an example:
Example: Open a file in Amazon S3 in read-only mode with
default file access properties.
H5F.close(fid);
I tried some optional arguments, which did not help:
>> H5F.open('https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9', 'H5F_ACC_RDONLY')
Not enough input arguments.
Error in H5F.open (line 130)
file_id = H5ML.hdf5lib2('H5Fopen', filename, flags, fapl, is_remote);
>> H5F.open('https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9', 'H5F_ACC_RDONLY', 'H5P_DEFAULT')
Error using hdf5lib2
Unable to access
'https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9'. The
specified URL scheme is invalid.
Error in H5F.open (line 130)
file_id = H5ML.hdf5lib2('H5Fopen', filename, flags, fapl, is_remote);
It looks like MATLAB is validating the input and requiring that the path start with s3://, which mine does not.
