Categorical to Numeric problem

Stephen Gray on 8 Jan 2024
Commented: Cris LaPierre on 11 Jan 2024
Hi
I have a table that contains both numeric and categorical items. I converted the categorical items to numeric using the unique() function, which works very well, and I can then feed the matrix into an NN for training. The problem is that when I feed in new data to get results, I don't know how to make sure the converted categorical data in the new table matches the numbers in the training data. That is, if a categorical value in the training data is converted to the number 5, how do I make sure that the same value gets assigned the same number when it appears in the new data? I'm beginning to think it may be a manual thing.
SPG

Accepted Answer

Hassaan on 8 Jan 2024
% Example training data (cell array of character vectors)
training_categorical_data = {'cat', 'dog', 'fish', 'dog', 'cat'};

% Build the category-to-number mapping from the training data only
unique_categories = unique(training_categorical_data);
category_to_number_map = containers.Map(unique_categories, num2cell(1:numel(unique_categories)));

% Convert the training data with the mapping.
% values() expects a cell array of keys, so the cell array is passed directly
% (wrapping it in num2cell would nest each key in another cell and fail).
numeric_training_data = cell2mat(values(category_to_number_map, training_categorical_data));

% Training process with numeric_training_data
% [Your neural network training code goes here]

% Example new data (cell array of character vectors)
new_categorical_data = {'dog', 'cat', 'bird'};

% Convert the new data to numeric using the training mapping
numeric_new_data = zeros(size(new_categorical_data));
for i = 1:numel(new_categorical_data)
    if isKey(category_to_number_map, new_categorical_data{i})
        numeric_new_data(i) = category_to_number_map(new_categorical_data{i});
    else
        % Handle unseen categories, e.g., assign a special number or ignore
        numeric_new_data(i) = NaN; % Assign NaN for unseen categories
    end
end

% Now, numeric_new_data is ready for use with the trained model
% [Your prediction or evaluation code goes here]
  • The training data training_categorical_data is a cell array of character vectors holding the category names. It is converted to numeric_training_data using a mapping (category_to_number_map) built from the training data alone.
  • The new data new_categorical_data is then converted using the same mapping. Unseen categories (like 'bird' in this example) are handled separately; here, I've assigned NaN to them, but you can choose another method as appropriate.
  • You'll need to insert your specific neural network training and prediction code where indicated. The numeric_training_data and numeric_new_data arrays are what you'd use for training and prediction, respectively.
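As a possible alternative to the explicit map, the same idea can be sketched with a categorical array whose category set is fixed by the training data. This is only a sketch: the variable names (train_raw, train_cat, cats, new_raw, new_codes) are made up for illustration and are not from the original post. It relies on categorical(data, valueset) marking values outside valueset as undefined, and on double() returning NaN for undefined elements.

```matlab
% Training data (hypothetical example values)
train_raw = {'cat', 'dog', 'fish', 'dog', 'cat'};
train_cat = categorical(train_raw);   % categories are sorted: cat, dog, fish
train_codes = double(train_cat);      % category indices for training

% Keep the category list from training; this is the "mapping" to reuse later
cats = categories(train_cat);

% New data: force the same category set so the codes line up
new_raw = {'dog', 'cat', 'bird'};
new_cat = categorical(new_raw, cats); % 'bird' is not in cats, so it becomes <undefined>
new_codes = double(new_cat);          % undefined elements convert to NaN
```

With this approach, only cats needs to be stored alongside the trained network; how NaN codes are handled afterwards is still up to you.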
  4 Comments
Stephen Gray on 10 Jan 2024
OK, using dictionary instead and it's working so far.
Stephen Gray on 11 Jan 2024
OK. I've got it to work now using dictionaries. Both this answer and the next one helped me get it working. As yours also includes how to handle new data, I'll mark it as the answer. Thanks both for answering.


More Answers (1)

Cris LaPierre on 8 Jan 2024
Moved: Cris LaPierre on 8 Jan 2024
Could you provide more details about your NN? I would think you should be able to pass categorical data into your network without having to convert it to numeric first.
If not, then I'd look into creating a dictionary, where you pass in the categorical value and it returns the numeric value.
A = categorical({'medium' 'large' 'small' 'medium' 'large' 'small'});
names = unique(A)
names = 1×3 categorical array
large medium small
values = (1:length(names));
d = dictionary(names,values)
d =
  dictionary (categorical --> double) with 3 entries:
    large  --> 1
    medium --> 2
    small  --> 3
A(4)
ans = categorical
medium
x = d(A(4))
x = 2
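To extend this to new data that may contain values never seen in training, one option is to check the keys with isKey before looking them up. The following is a minimal sketch, assuming string keys rather than categorical ones; the names train_raw, new_raw, known, and codes are hypothetical and chosen only for illustration.

```matlab
% Build the dictionary from the training values only (hypothetical data)
train_raw = ["medium" "large" "small" "medium" "large" "small"];
names = unique(train_raw);               % sorted: "large" "medium" "small"
d = dictionary(names, 1:numel(names));

% New data may contain values that were never seen in training
new_raw = ["small" "huge" "large"];

% Look up only the known keys; unseen values keep the NaN placeholder
codes = nan(size(new_raw));
known = isKey(d, new_raw);               % logical mask of keys present in d
codes(known) = d(new_raw(known));
```

Because d is built once from the training set, saving it together with the trained network (for example in the same MAT-file) would keep the numbering consistent across sessions.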
  4 Comments
Stephen Gray on 9 Jan 2024
Unfortunately not. The code part is
InpsM = table2cell(Inps);
OutsM =table2cell(Outs);
InpsM=InpsM';
OutsM=OutsM';
net=feedforwardnet([96,48,24]);
net.trainFcn = 'trainlm';
net.inputs{1}.processFcns = {'mapstd'};
net=train(net,InpsM,OutsM,'useParallel','yes');
The error I get is
Error using nntraining.setup>setupPerWorker
Inputs X{1,1} is not numeric or logical.
Error in nntraining.setup (line 77)
[net,data,tr,err] = setupPerWorker(net,trainFcn,X,Xi,Ai,T,EW,enableConfigure);
Error in network/train (line 336)
[net,data,tr,err] = nntraining.setup(net,net.trainFcn,X,Xi,Ai,T,EW,enableConfigure,isComposite);
Error in untitled (line 52)
net=train(net,InpsM,OutsM,'useParallel','yes');
SPG
Cris LaPierre on 11 Jan 2024
Found this, albeit on the trainNetwork page and not the train page, but it appears to still be applicable.
"To train a network using categorical features, you must first convert the categorical features to numeric."
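For a whole table, that conversion could look roughly like the sketch below, which turns each categorical variable into its category index and collects everything into a plain numeric matrix. The table contents and the names isCat, catSets, and X are assumptions made for illustration; only Inps follows the naming in the code above.

```matlab
% Hypothetical input table with one numeric and one categorical variable
Inps = table([1.5; 2.0; 3.1], categorical({'red'; 'blue'; 'red'}), ...
    'VariableNames', {'len', 'color'});

% Find which variables are categorical
isCat = varfun(@iscategorical, Inps, 'OutputFormat', 'uniform');

catSets = cell(1, width(Inps));          % category lists to reuse on new data
X = zeros(height(Inps), width(Inps));    % numeric matrix, samples in rows
for k = 1:width(Inps)
    col = Inps.(k);
    if isCat(k)
        catSets{k} = categories(col);    % remember the training categories
        X(:, k) = double(col);           % category index of each element
    else
        X(:, k) = col;
    end
end

X = X';  % one column per sample, the layout used when calling train with a matrix
```

When new data arrives, the stored catSets{k} can be passed as the value set to categorical(...) so that each value maps to the same index it had in training.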

Release

R2023b
