How to download multiple files from a website

Chad Greene on 21 Nov 2023

https://kr.mathworks.com/matlabcentral/answers/2050267-how-to-download-multiple-files-from-a-website

Commented: Dyuman Joshi on 22 Nov 2023

This question has been asked many times in various ways on this forum, but I've never found a simple answer to this very simple question:

How do I download all of the .nc files listed here? https://www.ngdc.noaa.gov/thredds/catalog/global/ETOPO2022/15s/15s_surface_elev_netcdf/catalog.html

It seems like there should be a two-line solution along the lines of:

url_list = get_urls('https://www.ngdc.noaa.gov/thredds/catalog/global/ETOPO2022/15s/15s_surface_elev_netcdf/catalog.html','extension','.nc'); 
websave(url_list)

if only get_urls were a real function, and websave accepted a list of file URLs and saved them all in the current directory.

3 Comments

Chad Greene on 21 Nov 2023

Wow, thank you @Dyuman Joshi!

Dyuman Joshi on 22 Nov 2023

You are welcome!

Accepted Answer

Voss on 21 Nov 2023

https://kr.mathworks.com/matlabcentral/answers/2050267-how-to-download-multiple-files-from-a-website#answer_1357512

url = 'https://www.ngdc.noaa.gov/thredds/catalog/global/ETOPO2022/15s/15s_surface_elev_netcdf/catalog.html';
% webread() the main page and parse out the links to .nc files:
data = webread(url);
C = regexp(data,'<a href=".*?(\?[^"]*\.nc)">','tokens');
temp_urls = strcat(url,vertcat(C{:}));
% webread() each linked url:
data = cell(size(temp_urls));
for ii = 1:numel(temp_urls)
    data{ii} = webread(temp_urls{ii});
end
% get the download link in each of those pages:
C = regexp(data,'<a href="([^"]*)">\s*<b>HTTPServer','tokens','once');
% append them to the (sub-)domain of the main URL to get the actual URLs 
% for downloading the .nc files:
idx = find(url == '/',3);
nc_urls = strcat(url(1:idx(end)-1),vertcat(C{:}));
% construct file names to save to locally:
[~,filenames,ext] = fileparts(nc_urls);
filenames = strcat(filenames,ext);
% download all the files:
for ii = 1:numel(nc_urls)
    websave(filenames{ii},nc_urls{ii});
end

3 Comments

Voss on 21 Nov 2023

You're welcome!

Each link on the main page goes to a distinct intermediate page which contains the link to download the actual .nc file.

The first webread/regexp gets the set of URLs to those intermediate pages. The loop then calls webread on each intermediate page, and a second regexp pulls out the download URL, which is the URL immediately preceding 'HTTPServer' on each page. (There are several other URLs on those pages, and that was the only way I could think of to be sure to get the right one.)
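For anyone who wants to try the two parsing steps without hitting the server, here is a minimal offline sketch. The HTML fragments, the example.com host, and the tile_A/tile_B names are made up for illustration and only approximate what the catalog pages look like; the two regexp patterns are the ones from the answer (with the dot before nc escaped).

```matlab
% Hypothetical stand-ins for the real pages (assumed markup, not copied
% from the actual catalog).
url = 'https://www.example.com/thredds/catalog/demo/catalog.html';
catalog_html = ['<a href="catalog.html?dataset=demo/tile_A.nc"><tt>tile_A.nc</tt></a>' ...
                '<a href="catalog.html?dataset=demo/tile_B.nc"><tt>tile_B.nc</tt></a>'];
page_html = ['<a href="/thredds/dodsC/demo/tile_A.nc.html"><b>OPENDAP</b></a>' ...
             '<a href="/thredds/fileServer/demo/tile_A.nc"> <b>HTTPServer</b></a>'];

% Step 1: pull the "?dataset=...nc" query strings out of the catalog page.
C = regexp(catalog_html, '<a href=".*?(\?[^"]*\.nc)">', 'tokens');
links = vertcat(C{:});               % N-by-1 cell array of query strings
page_urls = strcat(url, links);      % URLs of the intermediate pages

% Step 2: on an intermediate page, keep only the link labelled HTTPServer.
tok = regexp(page_html, '<a href="([^"]*)">\s*<b>HTTPServer', 'tokens', 'once');

% Step 3: prepend scheme + host (everything before the third '/').
idx = find(url == '/', 3);
nc_url = [url(1:idx(end)-1) tok{1}];

% Step 4: derive the local file name from the download URL.
[~, name, ext] = fileparts(nc_url);
fprintf('%s -> %s\n', nc_url, [name ext]);
```

Swapping the two hard-coded strings for webread(url) and webread(page_urls{ii}) gives back the structure of the answer's code; if the server changes its markup, the patterns would need adjusting.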

Chad Greene on 22 Nov 2023

Ooh, okay, that makes a lot of sense. Thanks @Voss!

Categories

MATLAB Installation and Licensing Downloads

Products

MATLAB

Release

R2023b