How to use the ReliefF algorithm for feature selection?

39 views (last 30 days)
phdcomputer Eng on 15 Feb 2020
Answered: Jingwei Too on 23 Jul 2020
I want to use the ReliefF algorithm for a feature selection problem. I have a dataset (CNS.mat) and want to apply ReliefF to it, obtain the top 30 features, and then apply a classifier to the result. I read how this algorithm is called in the MATLAB Help:
[RANKED,WEIGHT] = relieff(X,Y,K)
[RANKED,WEIGHT] = relieff(X,Y,K,'PARAM1',val1,'PARAM2',val2,...)
I also studied this ReliefF example from the MATLAB Help:
load fisheriris
[ranked,weight] = relieff(meas,species,10)
ranked =
4 3 1 2
weight =
0.1399 0.1226 0.3590 0.3754
But I don't know whether this code works the way I described (selects the top features and saves them as the result for classification). My aim is to apply ReliefF as feature selection on the CNS data and compare its results with other algorithms such as SVM-RFE and InfoGain.
I would be very grateful for your opinions on how to use ReliefF for feature selection.

Answers (2)

MeLearningProgramming on 23 Jul 2020
Edited: MeLearningProgramming on 23 Jul 2020
Hi,
I am using relieff as well. You have to be careful about how the outputs are organized.
weight = 0.1399 0.1226 0.3590 0.3754
means that the first parameter (column) in meas has the weight 0.1399, i.e. the j-th entry of weight belongs to the j-th column of meas.
ranked = 4 3 1 2 does not mean that the first parameter of meas has rank 4.
It lists the column indices of meas from most to least important, so the first parameter has ranking position 3 (the position of the number 1 in ranked) and the fourth parameter is ranked first.
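To make the relation between the two outputs concrete, here is a small sketch that uses the values printed in the question rather than calling relieff again. The variable names byWeight and rankOf are only illustrative.

```matlab
% Values copied from the fisheriris example in the question
ranked = [4 3 1 2];
weight = [0.1399 0.1226 0.3590 0.3754];

% Sorting the weights in descending order reproduces ranked
[~, byWeight] = sort(weight, 'descend');   % [4 3 1 2]

% Invert the ranking: rankOf(j) is the rank position of column j of meas
rankOf = zeros(1, numel(ranked));
rankOf(ranked) = 1:numel(ranked);          % [3 4 2 1]

disp(byWeight)
disp(rankOf)
```

So weight is indexed by feature, ranked is indexed by rank position, and rankOf(1) = 3 matches the statement that the first parameter has ranking position 3.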
How to use relieff?
X should be a matrix of size datapoints x parameters (in my case, for example, 147510x10) and y should be a column vector of size datapoints x 1 (147510x1).
First you should estimate the best k value, like this:
ParamLabels = {'P1','P2','P3','P4','P5','P6','P7','P8','P9','P10'};
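kMax = 200;
RankImportanceIdx = zeros(size(X,2),kMax);    % feature indices, best first, one column per k
RankImportanceWeight = zeros(size(X,2),kMax); % feature weights, one column per k
for k=1:kMax % or parfor
[idx,weights] = relieff(X,y,k);
RankImportanceIdx(:,k) = idx';
RankImportanceWeight(:,k) = weights';
end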
With a simple plot of RankImportanceWeight against k you can see from which k value onward the weights stop changing; that is the best k value.
In my case, for example, the best k value is 75. Afterwards you could plot the results like this:
plot(RankImportanceWeight(RankImportanceIdx(1:end,75),1:end)','LineWidth',2);
title('ReliefF algorithm weights vs. k-values','FontWeight','normal')
xlabel('size of k-nearest neighbor'); ylabel('weights');
legend(ParamLabels(RankImportanceIdx(1:end,75)),'Box','off');
set(gca,'FontName','Arial','FontSize',16);
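The "weights stop changing" criterion for k can also be checked numerically instead of by eye. The following is only a sketch under assumptions: it uses the ionosphere sample data that ships with the Statistics and Machine Learning Toolbox as a stand-in for your own X and y, and both the tolerance tol and the range kMax are hypothetical values you would need to tune.

```matlab
load ionosphere                      % stand-in data: X is 351x34, Y holds the class labels
kMax = 50;                           % assumed upper bound for k
W = zeros(size(X,2), kMax);          % one column of weights per k
for k = 1:kMax
    [~, w] = relieff(X, Y, k);
    W(:,k) = w(:);
end

% Largest absolute change of any weight when going from k to k+1
delta = max(abs(diff(W, 1, 2)), [], 1);

tol = 0.005;                         % hypothetical tolerance for "no longer changing"
lastBig = find(delta > tol, 1, 'last');
if isempty(lastBig)
    kStable = 1;                     % weights never moved by more than tol
else
    kStable = lastBig + 1;           % first k after the last large change
end
fprintf('Weights change by less than %g from k = %d on\n', tol, kStable);
```

If kStable comes out equal to kMax, the weights have not settled within the tested range, and a larger kMax or a looser tolerance would be needed.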
You could also create a table of rank positions, like this:
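RankImportanceTbl = cell(size(ParamLabels,2),1);
for pidx=1:size(ParamLabels,2)
% rank position of parameter pidx at k = 75 (1 = most important)
a = find(strcmp(ParamLabels(RankImportanceIdx(1:end,75)),ParamLabels{pidx}));
RankImportanceTbl{pidx,1} = a;
end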
With this you can choose the 30 parameters that best fit your y.
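As a rough sketch of the full workflow asked about in the question (rank with ReliefF, keep the top 30 features, then classify), the following may serve as a starting point. It is written under assumptions: the variable names inside CNS.mat are not known here, so the ionosphere sample data (34 features) is used as a stand-in, and k = 10 as well as the k-NN classifier are arbitrary illustrative choices.

```matlab
% Stand-in data; replace with your own, e.g. S = load('CNS.mat') and the
% matching variable names (assumed, not known from the question).
load ionosphere                          % X: 351x34 predictors, Y: class labels

k = 10;                                  % number of neighbours for ReliefF (assumed)
[ranked, weight] = relieff(X, Y, k);

nTop = min(30, size(X, 2));              % keep at most 30 features
topIdx = ranked(1:nTop);                 % column indices of the best features
Xtop = X(:, topIdx);                     % reduced data set used for classification

% Illustrative classifier: 5-nearest-neighbour with 5-fold cross-validation
rng(1);                                  % reproducible fold assignment
mdl = fitcknn(Xtop, Y, 'NumNeighbors', 5);
cvmdl = crossval(mdl, 'KFold', 5);
err = kfoldLoss(cvmdl);                  % misclassification rate
fprintf('Cross-validated error with %d features: %.3f\n', nTop, err);
```

Note that ranking the features on the full data set before cross-validating tends to give an optimistic error estimate. For a fair comparison with SVM-RFE or InfoGain, the selection step would have to be repeated inside each training fold.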
Hope this helps you adapt it to your problem.
regards,
MLP

Jingwei Too on 23 Jul 2020
