How do I import Velocity 3.2.0 CSV DVH data into MATLAB 9.1 (R2016b)?

How do I import radiation oncology software Velocity 3.2.0's dose-volume histogram (DVH) data in a comma-separated value file (CSV, sample file attached) into MATLAB 9.1 (R2016b)? Using Velocity one can create DVH data for multiple tissues displayed in a single graph, and export this data as a sequential two-column CSV.
csvread requires that "the file must contain only numeric values", whereas the CSV is two columns of data sets that begin with header text and end with an empty row.
For this reason, it appears that a simple call to importdata is insufficient, because the command terminates after importing only the first data set:
test = importdata('filename.csv');
test =
struct with fields:
data: [1024×2 double]
textdata: {2×1 cell}
whereas the file actually contains additional data sets (e.g. copying from row 1026):
58.1704 0.00692086
Prostate
GY (CC)
55.2304 0.0046139
55.2333 0.00230695
What do we use to import data in CSV that is formatted as follows? (The following describes what is seen using Excel 2016.)
header text in Column 1
header text in Columns 1 and 2
numerical data in columns 1 and 2 in multiple rows
empty row
(repeat for next data set for multiple data sets of various length)
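The layout listed above can be read line by line. The following is a minimal sketch, not the method from the accepted answer: it first writes a small sample file so that it runs on its own. The structure names, the numeric values, and the exact header text `GY,(CC)` are assumptions made for illustration and are not taken from the attached Velocity file.

```matlab
% Build a small sample file with the layout described above.
% ASSUMPTION: names, values, and header text are made up for illustration.
fname = [tempname '.csv'];
fid = fopen(fname, 'wt');
fprintf(fid, 'Bladder\nGY,(CC)\n0.5,10.2\n1.0,9.8\n\n');
fprintf(fid, 'Prostate\nGY,(CC)\n55.2304,0.0046139\n55.2333,0.00230695\n\n');
fclose(fid);

names = {};  units = {};  curves = {};
fid = fopen(fname, 'rt');
tline = fgetl(fid);
while ischar(tline)
    % Skip blank or comma-only separator rows.
    if isempty(strtrim(strrep(tline, ',', ' ')))
        tline = fgetl(fid);
        continue
    end
    name = regexprep(strtrim(tline), ',+$', '');  % first header line
    unit = fgetl(fid);                            % second header line
    if ~ischar(unit), break; end
    block = zeros(0, 2);
    tline = fgetl(fid);
    while ischar(tline)
        vals = sscanf(tline, '%f,%f');
        if numel(vals) ~= 2, break; end           % not a numeric row
        block(end+1, :) = vals.';                 % append one data row
        tline = fgetl(fid);
    end
    names{end+1}  = name;
    units{end+1}  = strtrim(unit);
    curves{end+1} = block;
end
fclose(fid);
delete(fname);
```

Because a non-numeric line ends the inner loop and is then handed back to the outer loop, this sketch does not depend on whether an empty row actually separates two data sets.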
Walter Roberson requested a sample data file and provided a solution below using fopen, fgetl, feof, and textscan.
  Comments: 4
Walter Roberson on 5 Jan 2017
Velocity appears to be from Varian. Varian advertises,
"Velocity provides a vendor-neutral platform that integrates image, structure, plan and dose data to create a unified patient dataset." Unfortunately their documentation is a bit sparse as to what that format is. Except they mention DICOM, and they mention RT Plan software. Someone has written software to read DICOM RT Plan data in MATLAB; see https://github.com/ulrikls/dicomrt2matlab
It sounds like your data is not DICOM based.
As I poke around, the information I am finding about DVH suggests that the most common formats are not what you are describing your file as having. But it is difficult to tell, as you have not given an example file.
Walter Roberson on 5 Jan 2017
There are millions of file formats. People invent their own more often than they use standard formats, and they modify the file format over time, often without considering backwards compatibility. There is no practical way for MathWorks to already support them all.
Mag tape was always written in records, often fixed-length binary records. Variable-length records did exist, but when it came time to start a new data structure, typically a new record was written. Not inevitably though: packing multiple structures into one tape record did happen. Remember though that memory was typically not large, and mag tape has to be read in a complete record at a time (no positioning by bytes), so variable-length records did not pack in long continuous streams the way that later became common in disk files.


Accepted Answer

Walter Roberson on 5 Jan 2017
There is no pre-written MathWorks routine to read that file format. It is, however, not difficult to write code for it.
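num = 0;
headers = {};   % one row per data set: {first header line, second header line}
data = {};      % one N-by-2 numeric matrix per data set
fid = fopen('sample.csv','rt');
while true
    H1 = fgetl(fid);    % first header line (organ name)
    if feof(fid); break; end
    H2 = fgetl(fid);    % second header line
    if feof(fid); break; end
    % 'collectoutput' merges the two numeric columns into one N-by-2 matrix
    datacell = textscan(fid, '%f%f', 'delimiter', ',', 'collectoutput', true);
    if isempty(datacell) || isempty(datacell{1}); break; end
    num = num + 1;
    headers(num,:) = {H1, H2};
    data(num) = datacell;
    fgetl(fid); %the empty line between organs
end
fclose(fid);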
This will create two cell arrays, one of headers and the other of corresponding numeric values. You might want to do some processing on H1 (organ name) and H2 (not sure what that line is for) before storing that information.
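As one way to do that processing, the two cell arrays can be combined into a struct array so that each structure's name travels with its numbers. This is only a sketch: the `headers` and `data` values below are hypothetical stand-ins for parsed output (an N-by-2 cell array of header lines and a 1-by-N cell array of numeric matrices), and the field names are invented for illustration.

```matlab
% ASSUMPTION: stand-in values shaped like the parsed output described above.
headers = {'Bladder', 'GY,(CC)'; 'Prostate', 'GY,(CC)'};
data    = {[0.5 10.2; 1.0 9.8], [55.2304 0.0046139; 55.2333 0.00230695]};

% Passing 1-by-N cell arrays to struct() yields a 1-by-N struct array.
dvh = struct('name',   headers(:,1).', ...
             'units',  headers(:,2).', ...
             'values', data);

% Look up one structure by name and pull out its two-column data.
idx = find(strcmp({dvh.name}, 'Prostate'), 1);
prostate = dvh(idx).values;
```

Indexing by name avoids relying on the order in which the data sets were exported.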
  Comments: 7
Daniel Bridges on 7 Jan 2017
Edited: Daniel Bridges on 7 Jan 2017
Is it not more legible and memory-efficient to put it immediately after textscan's cell array creation?
datacell = textscan(fid,'%f%f','delimiter',',','collectoutput',true);
if isempty(datacell) || isempty(datacell{1}); break; end
if any(isnan(datacell{1}(end,:))); datacell{1}(end,:) = []; end
Walter Roberson on 7 Jan 2017
No, it is the same efficiency. But it certainly does not hurt to have it closer to where datacell is created.
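The NaN check discussed in these comments can be tried in isolation. The sketch below assumes that the trailing row of NaN values comes from textscan filling empty numeric fields with its default empty value; the matrices are made-up examples, not data from the sample file.

```matlab
% ASSUMPTION: made-up block whose last row stands for a comma-only separator row.
block = [55.2304 0.0046139; 55.2333 0.00230695; NaN NaN];

% Same test as in the comment above: drop the last row if it contains NaN.
if any(isnan(block(end, :)))
    block(end, :) = [];
end

% A more general variant that removes every row containing a NaN.
block2 = [1 2; NaN NaN; 3 4];
block2(any(isnan(block2), 2), :) = [];
```

The first form only touches the final row, so genuine NaN values elsewhere in a data set are left alone; the second form is broader and may discard more than intended.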

