Creating numerical variables from categorical variables in an unbalanced dataset
Hello there,
I would like to apply the Random Forest method to a highly unbalanced dataset that includes both numerical and categorical variables. To improve my classification results, I thought I would create synthetic datasets using the SMOTE and ADASYN algorithms before classification. However, both methods work only with numerical variables, so I would like to ask if you have any suggestions on how to transform my categorical variables into numerical ones.
With many thanks in advance for your help
Accepted Answer
Lei Hou
14 Feb 2020
Hi Grigorios,
You can do something like the following.
catVar = categorical(["a" "b" "c" "b" "a"]);
numValue = [0.1 3 100]; % The order of numbers refers to the order of categories returned by categories(catVar)
numVar = numValue(catVar)
I hope this solution is helpful to you.
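A possible alternative, in case the hand-picked numbers above are a concern: assigning one number per category implies an ordering and a spacing between categories, and distance-based methods such as SMOTE and ADASYN will treat those as real. The sketch below is not part of the original answer. It reuses catVar from above and shows two base-MATLAB encodings: the integer category codes and a one-hot (dummy) matrix. The names codes, numCats and oneHot are illustrative only, and the one-hot line assumes implicit expansion (R2016b or later).

```matlab
% Example data, same as in the answer above.
catVar = categorical(["a" "b" "c" "b" "a"]);

% Integer codes: the position of each element's category in categories(catVar).
% catVar(:) reshapes to a column so that each row is one observation.
codes = double(catVar(:));

% One-hot encoding: one 0/1 column per category, so no ordering or
% spacing between categories is implied.
numCats = numel(categories(catVar));
oneHot = double(codes == 1:numCats);   % rows = observations, columns = categories
```

Here oneHot has one row per observation and one column per category, in the order returned by categories(catVar). If the Statistics and Machine Learning Toolbox is available, dummyvar(catVar(:)) should give an equivalent matrix. Note that synthetic samples interpolated from one-hot columns can have fractional values, which may need rounding back to a single category.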