How do I find a folder with a specified string?

Daniel Bridges on 30 Jan 2018
Edited: Stephen23 on 31 Jan 2018
I think I need your help using regexp. My goal is to find the RTPLAN DICOM file and read particular metadata from it. To get the full folder name to pass to fullfile, and then to dicominfo, I tried the following, which failed with an error I don't understand:
>> result = regexp(listing.name,'RTPLAN','match')
Error using regexp
Invalid option for regexp:
doe^john_anon53250_ct_2013-02-04_100401_for.use.in.interfractional.blurring.study_planned.treatment_n132__00000.
The folder containing the string 'RTPLAN' clearly exists as the penultimate entry in the following directory listing, which is what we get after exporting anonymized patient data from MIM Maestro:
>> DICOMdatafolder = '/home/sony/Documents/research/data/DICOMfiles/5';
listing = dir(DICOMdatafolder);
listing.name
ans =
'.'
ans =
'..'
ans =
'DOE^JOHN_ANON53250_CT_2013-02-04_100401_for.use.in.interfractional.blurring.study_planned.treatment_n132__00000'
ans =
'DOE^JOHN_ANON53250_RTDOSE_2013-02-04_100401_for.use.in.interfractional.blurring.study_planned.treatment_n1__00000'
ans =
'DOE^JOHN_ANON53250_RTDOSE_2013-02-04_100401_for.use.in.interfractional.blurring.study_planned.treatment_n1__00001'
ans =
'DOE^JOHN_ANON53250_RTDOSE_2013-02-04_100401_for.use.in.interfractional.blurring.study_planned.treatment_n1__00002'
ans =
'DOE^JOHN_ANON53250_RTDOSE_2013-02-04_100401_for.use.in.interfractional.blurring.study_planned.treatment_n1__00003'
ans =
'DOE^JOHN_ANON53250_RTDOSE_2013-02-04_100401_for.use.in.interfractional.blurring.study_planned.treatment_n1__00004'
ans =
'DOE^JOHN_ANON53250_RTDOSE_2013-02-04_100401_for.use.in.interfractional.blurring.study_planned.treatment_n1__00005'
ans =
'DOE^JOHN_ANON53250_RTDOSE_2013-02-04_100401_for.use.in.interfractional.blurring.study_planned.treatment_n1__00006'
ans =
'DOE^JOHN_ANON53250_RTDOSE_2013-02-04_100401_for.use.in.interfractional.blurring.study_planned.treatment_n1__00007'
ans =
'DOE^JOHN_ANON53250_RTDOSE_2013-02-04_100401_for.use.in.interfractional.blurring.study_planned.treatment_n1__00008'
ans =
'DOE^JOHN_ANON53250_RTDOSE_2013-02-04_100401_for.use.in.interfractional.blurring.study_planned.treatment_n1__00009'
ans =
'DOE^JOHN_ANON53250_RTDOSE_2013-02-04_100401_for.use.in.interfractional.blurring.study_planned.treatment_n1__0000A'
ans =
'DOE^JOHN_ANON53250_RTPLAN_2013-02-04_100401_for.use.in.interfractional.blurring.study_planned.treatment_n1__00000'
ans =
'DOE^JOHN_ANON53250_RTst_2013-02-04_100401_for.use.in.interfractional.blurring.study_planned.treatment_n1__00000'
So the task I'm trying to accomplish is: Given a list of folder names like this, grab the one that contains 'RTPLAN' so it can be used in fullfile. What was wrong with my use of regexp?

Accepted Answer

Stephen23 on 30 Jan 2018
Edited: Stephen23 on 30 Jan 2018
The problem is that listing.name expands to a comma-separated list, so your code
regexp(listing.name,'RTPLAN','match')
is exactly equivalent to
regexp(listing(1).name, listing(2).name, listing(3).name, listing(4).name, listing(5).name, listing(6).name, ... , 'RTPLAN','match')
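where each element of the structure listing supplies its name field as a separate input argument to regexp. This produces far too many inputs for regexp, and those inputs land in meaningless positions: the first name is treated as the text to search, the second as the regular expression, and the third as an option, which is why the error message quotes the third name as an invalid option.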
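One way to see this expansion for yourself is to count the arguments a function actually receives. The sketch below assumes a small made-up structure array (the names are hypothetical stand-ins for real dir output) and an anonymous helper, countArgs, that is not part of the original question:

```matlab
% Hypothetical stand-in for the output of dir: a 1x4 structure array.
listing = struct('name', {'.', '..', 'PATIENT_CT_00000', 'PATIENT_RTPLAN_00000'});

% Illustrative helper: returns how many input arguments it was called with.
countArgs = @(varargin) numel(varargin);

% listing.name expands to four separate arguments, plus the two literals.
nExpanded = countArgs(listing.name, 'RTPLAN', 'match');

% Wrapping the list in braces passes one cell array, plus the two literals.
nWrapped = countArgs({listing.name}, 'RTPLAN', 'match');

fprintf('expanded: %d arguments, wrapped: %d arguments\n', nExpanded, nWrapped);
```

With the four-element array assumed here, the first call receives six arguments and the second receives three, which is the difference between the failing and the working regexp call.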
Comma-separated lists were introduced in my answer to your earlier question:
The solution is to put all of those elements of that list into one cell array, e.g.:
result = regexp({listing.name},'RTPLAN','match')
where
{listing.name}
is of course equivalent to
{listing(1).name, listing(2).name, listing(3).name, ...}
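As a small illustration of what the cell-array form returns, the sketch below runs regexp on a made-up structure array (the names are hypothetical, not the real folder names) and counts the matches found in each name:

```matlab
% Hypothetical stand-in for the output of dir.
listing = struct('name', {'.', '..', 'PATIENT_CT_00000', 'PATIENT_RTPLAN_00000'});

% One cell array in, one cell array of the same size out.
result = regexp({listing.name}, 'RTPLAN', 'match');

% Each output cell holds the matches found in the corresponding name,
% so its number of elements is the number of matches in that name.
nMatches = cellfun(@numel, result);
disp(size(result))
disp(nMatches)
```

Under these assumptions result has one cell per name, and only the cell belonging to the RTPLAN name is non-empty.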
Comma-separated lists are explained in the MATLAB documentation that I linked to in my earlier answer. I would recommend reviewing what they are, because judging by your other question they are causing you some confusion (in particular, a comma-separated list is not one variable). You might like to start here:
  4 Comments
Stephen23 on 31 Jan 2018
Edited: Stephen23 on 31 Jan 2018
Because in this case the input to regexp is a cell array of strings, the output is a cell array of the same size. Assuming exactly one filename matches, one of the cells will be non-empty (containing the matched string, the substring, or its index, depending on which output you select). You then have to do some post-processing to get the contents of that one cell, such as checking which cells are non-empty to generate a logical index:
>> C = {listing.name};
>> idx = ~cellfun('isempty',regexp(C,'RTPLAN','once'));
>> C{idx}
ans = DOE^JOHN_ANON53250_RTPLAN_2013-02-04_100401_for.use.in.interfractional.blurring.study_planned.treatment_n1__00000
However, for matching such a simple substring regexp is overkill. Here are two ways to match that filename, based on the faster strfind:
From cell array:
>> C = {listing.name};
>> idx = ~cellfun('isempty',strfind(C,'RTPLAN' ));
>> C{idx}
ans = DOE^JOHN_ANON53250_RTPLAN_2013-02-04_100401_for.use.in.interfractional.blurring.study_planned.treatment_n1__00000
From structure:
>> idx = ~cellfun('isempty',strfind({listing.name},'RTPLAN' ));
>> listing(idx).name
ans = DOE^JOHN_ANON53250_RTPLAN_2013-02-04_100401_for.use.in.interfractional.blurring.study_planned.treatment_n1__00000
You should also add some error checking, because otherwise the last step could produce several values in a comma-separated list (or none at all). Whichever approach you choose, put this immediately after idx is defined:
assert(nnz(idx)==1,'less than or more than one file found')
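Putting these pieces together for the original goal of building a path with fullfile, a minimal sketch might look like the following. It creates a throwaway directory tree so that it is self-contained; the root location and the folder names are assumptions made for illustration, not the real exported data:

```matlab
% Build a temporary directory tree with hypothetical folder names.
root = tempname;
mkdir(root);
names = {'PATIENT_CT_00000', 'PATIENT_RTDOSE_00000', 'PATIENT_RTPLAN_00000'};
for k = 1:numel(names)
    mkdir(fullfile(root, names{k}));
end

listing = dir(root);
listing = listing([listing.isdir]);      % keep folders only

% Logical index of the names that contain the substring.
idx = ~cellfun('isempty', strfind({listing.name}, 'RTPLAN'));
assert(nnz(idx)==1, 'less than or more than one folder found')

% Exactly one element is selected, so this is a single character vector.
planFolder = fullfile(root, listing(idx).name);
disp(planFolder)

rmdir(root, 's');                        % remove the temporary tree
```

On R2016b or later, contains({listing.name},'RTPLAN') should give the same logical index in one call. Passing a wildcard to dir, as in dir(fullfile(root,'*RTPLAN*')), is another option that lets dir do the filtering.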
