필터 지우기
필터 지우기

Stop xlsread converting hex values to double

조회 수: 4 (최근 30일)
kevin
kevin 2018년 6월 4일
댓글: kevin 2018년 6월 7일
Reading in data using xlsread the raw values are not exactly raw. The cells contain hex data and whenever e is present (2e39) it is converting it to a double as 2.000e+39. Is there anyway to force it to read in as '2e39'? I found were it's being read in xlsreadCOM as rawData = DataRange.Value, I also see that VBA has xlRangeValueXMLSpreadsheet that is supposed to read in the exact value but I'm not sure how to apply that in Matlab. Using Matlab2012a

채택된 답변

Walter Roberson
Walter Roberson 2018년 6월 4일
We have seen this difficulty for .xlsx files in the past. .xlsx files code all numbers as strings, so MATLAB normally converts to numeric any string that str2double() says looks numeric enough to convert to double. This causes various problems, including the one you indicated, and including losing leading 0's on strings that just happen to consist entirely of numbers.
My investigation indicates that there are two ways of storing strings in .xlsx files, one that uses a "shared strings" table, and the other that uses (for lack of better term) "immediate" strings. Strings that are in the "shared strings" table do not seem to be processed the same way that "immediate" strings are.
However.. even for .xls files, there is a challenge. The [num, txt, raw] processing that is done mostly happens by attempting to convert to numeric form, and if the numeric conversion works then the entry is filled into the num table and the corresponding raw entry has the converted value; entries that cannot be converted are put in the txt table and the corresponding raw entry is left as text. This means that if you have a string that looks like a number, it may get converted.
We find that sometimes using readtable() works better at getting the correct data type.
  댓글 수: 9
Walter Roberson
Walter Roberson 2018년 6월 6일
NumHeaders = 17; %adjust as needed, can be 0
S = fileread('AppropriateFileName.csv');
filelines = regexp(S, '\r?\n', 'split');
if isempty(filelines{end})); filelines(end) = []; end %very common that file ends with \n leading to empty file field
filelines(1:NumHeaders) = [];
filefields = regexp(filelines, ',', 'split');
Now filefields is a cell array, and each entry in it is a cell array with as many entries as there were fields.
This structure can be less fun to deal with than some other structures, but if you need to process the lines with fewer or more fields differently than the other lines, then you need to maintain something similar to this (though possibly you might want to look for groups of lines with the same number of fields and merge them into blocks.)
It is practical to proceed from here to
empty_field = ''; %could also be numeric
numfields = cellfun(@length, filefields);
maxfields = max(numfields);
pad = repmat({empty_field}, 1, maxfields);
FirstFields = @(S) S(1:maxfields);
PadField = @(S) FirstFields( [S, pad] );
padded_fields = cellfun(PadField, filefields, 'Uniform', 0);
data = vertcat(padded_fields{:});
Now data would be a rectangular cell array in which all lines have been padded out to the maximum number of fields used in the file, with the empty fields using the content of empty_field as the placeholder
kevin
kevin 2018년 6월 7일
That looks like it. Thanks a lot this was really helpful.

댓글을 달려면 로그인하십시오.

추가 답변 (0개)

카테고리

Help CenterFile Exchange에서 Text Data Preparation에 대해 자세히 알아보기

태그

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by