How to assign numbers to categorical values in a dataset?

pp on 18 May 2020
Commented: Adam Danz on 18 May 2020
I'm preparing a dataset for machine learning. The dataset contains a column named "Holiday" with more than a million rows. The column is categorical and contains 4 unique values: 0 (as a string), a, b, and c.
I want to map 0 to 0 and map the rest (a, b, and c) to 1. How do I do that? Is there a ready-made function?

Accepted Answer

Adam Danz on 18 May 2020
Edited: Adam Danz on 18 May 2020
If you want to return logical values,
dummyVars = Holiday ~= '0'; % Holiday is categorical
If you want to return numeric (double) values,
dummyVars = double(Holiday ~= '0'); % Holiday is categorical
Note that any value of Holiday that doesn't equal '0' will be assigned a value of 1.
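As a quick check, here is a minimal sketch of the comparison above on a small made-up column. The sample values and the variable names isHoliday and isHolidayNum are illustrative assumptions, not part of the original dataset.

```matlab
% Hypothetical sample data standing in for the real Holiday column
Holiday = categorical({'0'; 'a'; 'b'; '0'; 'c'});

% Logical result: true wherever the category is not '0'
isHoliday = Holiday ~= '0';

% Same result stored as doubles (0 or 1)
isHolidayNum = double(isHoliday);

disp(isHolidayNum.')   % displays 0 1 1 0 1

% If the column were a cell array of character vectors instead of a
% categorical array, strcmp would be the analogous comparison:
% isHolidayNum = double(~strcmp(HolidayCell, '0'));
```

The comparison is vectorized, so it should scale to a column with a million rows without a loop.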
  Comments: 4
pp on 18 May 2020
Thanks! That did the job. Is it possible to extend this so that we can assign other numbers to a, b and c? Let's say 1, 2 and 3?
Adam Danz on 18 May 2020
In that case, you can use
[groupID, groups] = findgroups(Holiday); % groupID holds the group numbers, groups the unique values
or
[groupID, groups] = grp2idx(Holiday); % requires the Statistics and Machine Learning Toolbox
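Note that both functions number groups starting at 1, so with the categories '0', 'a', 'b', 'c' the value '0' would get group 1, not 0. The sketch below, using made-up sample data and illustrative variable names, shows one way to shift the numbering so that '0' maps to 0 and a, b, c map to 1, 2, 3. It assumes the categories sort as '0', 'a', 'b', 'c'; the second variant pins that order explicitly.

```matlab
% Hypothetical sample data standing in for the real Holiday column
Holiday = categorical({'0'; 'a'; 'b'; '0'; 'c'});

% findgroups numbers the groups 1..N in category order
[groupID, groups] = findgroups(Holiday);

% Shift by one so that '0' -> 0, 'a' -> 1, 'b' -> 2, 'c' -> 3
codes = groupID - 1;
disp(codes.')          % displays 0 1 2 0 3

% Variant: fix the category order explicitly, then convert the
% categorical array to its category indices with double()
HolidayOrdered = categorical(cellstr(Holiday), {'0', 'a', 'b', 'c'});
codesExplicit = double(HolidayOrdered) - 1;
```

Pinning the category order is the safer choice when the numbers must not depend on which values happen to appear in the data.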

Release: R2019b