Converting Categorical Array/Table to Numerical

Claire Hollow on 10 Jun 2020
Commented: Adam Danz on 23 Sep 2022
Hello!
I have a 23000x4 table of data (called temp). It is all numbers, except that wherever data was missing it was filled in with 'NA', so the table is categorical. I would like to convert it to numeric and have every 'NA' changed to NaN so that I can run max, min, and similar functions on it without running into issues. Thank you for the help!
  12 Comments
Claire Hollow on 10 Jun 2020
The data I need to use came from a climate station in a csv file. When I imported it, it came in as categorical because of its contents.
Adam Danz on 10 Jun 2020
What function did you use to import it? You can control the class of the data you're importing. A csv file doesn't control that for numeric values.


Accepted Answer

Adam Danz on 10 Jun 2020
Edited: Adam Danz on 23 Sep 2022
By far the best solution is to avoid representing numeric values as categorical values in the first place. If you can un-do that, that's the best solution.
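As a rough sketch of what avoiding the problem at import time could look like: the file name and contents below are hypothetical, made up only so the example runs on its own, and the import options shown are one possible approach rather than the method the original poster used. The idea is to force every column to double and to treat the text 'NA' as missing, so it arrives as NaN.

```matlab
% Hypothetical file name and contents, created here only so the sketch runs
fname = fullfile(tempdir, 'station_demo.csv');
fid = fopen(fname, 'w');
fprintf(fid, 'T1,T2\n1.5,NA\nNA,3\n4,5.25\n');
fclose(fid);

% Detect the import settings, then force every variable to double and
% treat the text 'NA' as missing so that it is imported as NaN
opts = detectImportOptions(fname);
opts = setvartype(opts, 'double');
opts = setvaropts(opts, 'TreatAsMissing', 'NA');
temp = readtable(fname, opts);

% max and min ignore NaN values by default
colMax = max(temp{:,:}, [], 1);
colMin = min(temp{:,:}, [], 1);
```

With the data imported as double from the start, no categorical conversion is needed afterwards.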
If that cannot be done, here's how to convert categorical values that contain numeric values in a table.
Table T can contain mixed classes (some classes may cause errors).
This demo detects which columns of T contain values that can be converted to numbers. It then creates an output table T_converted that contains the non-categorical columns of T unchanged and the categorical-number columns converted to numbers.
% Create demo table with a mix of strings, categorical numerals, and numbers
T = table(["A";"B";"C";"D";"E"], ...
categorical(randi(10,5,1)), ...
randi(10,5,1), ...
categorical(randi(10,5,1)), ...
'VariableNames', {'A','B','C','D'})
T = 5×4 table
     A     B     C    D
    ___    __    _    _
    "A"    2     3    4
    "B"    3     4    8
    "C"    9     6    8
    "D"    9     4    5
    "E"    10    5    6
varfun(@class, T)
ans = 1×4 table
    class_A      class_B      class_C      class_D
    _______    ___________    _______    ___________
    string     categorical    double     categorical
% Determine which columns are categoricals
% NOTE: This assumes you want to convert all categorical table variables
% to numeric. Otherwise, additional column indexing will be needed.
iscat = varfun(@iscategorical, T,'OutputFormat','Uniform');
% Convert the categorical table variables to numeric
Tnum = array2table(str2double(string(T{:,iscat})), ...
'VariableNames', T.Properties.VariableNames(iscat));
% Create an updated table with the converted data and maintain
% original column order
T_converted = [T(:, ~iscat), Tnum];
% Find where each original column name sits in T_converted
[~,colorder] = ismember(T.Properties.VariableNames, T_converted.Properties.VariableNames);
T_converted(:,colorder)
ans = 5×4 table
     A     B     C    D
    ___    __    _    _
    "A"    2     3    4
    "B"    3     4    8
    "C"    9     6    8
    "D"    9     4    5
    "E"    10    5    6
varfun(@class, T_converted)
ans = 1×4 table
    class_A    class_C    class_B    class_D
    _______    _______    _______    _______
    string     double     double     double
This answer was corrected on 9/23/22; thanks to Ahmed Rady for pointing out the problem.
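For the case described in the question, where missing entries appear as the category 'NA', the same conversion might look like the sketch below. The table contents and the variable names T1 and T2 are invented for illustration. Because str2double returns NaN for text it cannot parse, 'NA' becomes NaN without any extra handling.

```matlab
% Small stand-in for the 23000x4 table; values and names are made up
temp = table(categorical(["3.5";"NA";"7";"1.25"]), ...
             categorical(["NA";"2";"9";"4"]), ...
             'VariableNames', {'T1','T2'});

% Same conversion as in the answer: categorical -> string -> double
iscat = varfun(@iscategorical, temp, 'OutputFormat', 'uniform');
tempNum = array2table(str2double(string(temp{:,iscat})), ...
    'VariableNames', temp.Properties.VariableNames(iscat));

% 'NA' is now NaN, which max and min ignore by default
colMax = max(tempNum{:,:}, [], 1);
colMin = min(tempNum{:,:}, [], 1);
```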
  2 Comments
Ahmed Rady on 23 Sep 2022
Edited: Ahmed Rady on 23 Sep 2022
Hi Adam,
Thanks for the code.
But it seems that the numeric values in the original table were transformed.
I thought the aim was to change only the categorical variables.
Adam Danz on 23 Sep 2022
Thanks @Ahmed Rady, I've updated the answer to correct the mistake.
My previous answer applied double() to the categorical numbers, which converts each categorical value into its group (category) index rather than the number represented by its category name.
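A minimal illustration of that difference, using made-up values:

```matlab
% Categories are created in sorted order: "10", "20", "30"
c = categorical([10; 30; 20; 30]);

% double() returns each element's category index, not the value it names
idx = double(c);               % [1; 3; 2; 3]

% Going through string recovers the numbers written in the category names
vals = str2double(string(c));  % [10; 30; 20; 30]
```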


More Answers (0)
