
Splitting data set using information gain

Silpa K on 8 Nov 2019
Answered: Shishir Singhal on 28 Jul 2020
How can I find the maximum information gain, and how can I use it to split each row of my data set?
2 Comments
KSSV on 8 Nov 2019
Can you elaborate? What is maximum information gain? What exactly do you want to do with the data?
Silpa K on 8 Nov 2019
Alternatively, is it possible to do the splitting with information entropy?
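clc
clear
b = zeros(36,1);                        % preallocated, not used below
ts = xlsread('ArrowHead_TRAIN.xlsx');   % each row is one time series
l = length(ts);                         % largest dimension of ts, not used below
for i = 1:36
    p = ts(i,:);                        % i-th series
    fa = movstd(p,20,1);                % moving standard deviation, window of 20
    secarray = movstd(fa,20,1);         % moving std of the moving std
    k = maxk(secarray,10);              % 10 largest values of secarray
    [~,ii] = min(abs(p(:) - k(:)'));    % index of the point in p closest to each value in k
    out = p(unique(ii));                % selected points (overwritten on every iteration)
end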
Here I have some points in out. I need to compute the information entropy and, based on it, split each row of my data set. Then I need to check whether each split series contains any of the points in out; if it does, I want to write only that split series to an Excel file.
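One possible reading of this request, sketched under several assumptions: the row is discretized into a fixed number of bins (nBins is a made-up parameter), the split position is the one that maximizes the drop in Shannon entropy of the bin labels, and a segment is kept when it contains at least one value from out. The data below is hypothetical and only stands in for one row of ts and its out points.

```matlab
% Hypothetical example data: one series (row) and its points of interest.
p   = [1 1 2 1 2 1 8 9 8 9 9 8];   % stands in for one row of ts
out = 9;                            % stands in for the points found in the loop

nBins = 4;                                    % assumed number of bins
edges = linspace(min(p), max(p), nBins + 1);  % assumes p is not constant
bins  = discretize(p, edges);                 % bin index of every sample

% Shannon entropy (in bits) of a vector of bin indices.
probs = @(b) nonzeros(accumarray(b(:), 1) / numel(b));
H     = @(b) -sum(probs(b) .* log2(probs(b)));

% Information gain of splitting the row after sample s.
n    = numel(p);
gain = zeros(1, n - 1);
for s = 1:n-1
    gain(s) = H(bins) - (s/n) * H(bins(1:s)) - ((n-s)/n) * H(bins(s+1:n));
end
[bestGain, bestSplit] = max(gain);

% Split the row and keep only the segments that contain a point from out.
segments = {p(1:bestSplit), p(bestSplit+1:end)};
keep     = cellfun(@(seg) any(ismember(seg, out)), segments);
selected = segments(keep);
```

Each kept segment in selected could then be written to a spreadsheet, for example with writematrix or xlswrite. Note that ismember compares values exactly, which works here only because the points in out are taken from p itself.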


Answers (1)

Shishir Singhal on 28 Jul 2020
Hi,
It seems like you want to split your data into two sets on the basis of information gain.
You only need to decide which column of your dataset represents the "class", i.e. the target variable, and which columns represent features. Use the function "entropyF" to calculate the entropy of each feature variable with respect to the "class" variable. The function "getBestEnt" returns the index of the feature with the highest information gain.
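As a rough illustration of the idea, and not the implementation of "entropyF" or "getBestEnt", the sketch below computes the information gain of each feature column with respect to a class column and picks the best one. The toy data, the choice of the first column as the class, and the median threshold used to make a binary split are all assumptions.

```matlab
% Hypothetical toy data: first column is the class label, the rest are features.
data = [1  0.1  5.0;
        1  0.2  1.0;
        1  0.3  4.0;
        2  0.8  2.0;
        2  0.9  5.5;
        2  0.7  1.5];
y = data(:, 1);
X = data(:, 2:end);

% Shannon entropy (in bits) of a vector of class labels.
classEntropy = @(labels) -sum(arrayfun( ...
    @(c) mean(labels == c) * log2(mean(labels == c)), unique(labels)));

% Information gain of a binary split of each feature at its median (assumed rule).
nFeatures = size(X, 2);
gain = zeros(1, nFeatures);
for j = 1:nFeatures
    left = X(:, j) <= median(X(:, j));
    wL   = mean(left);                       % fraction of rows on the left side
    gain(j) = classEntropy(y) ...
            - wL       * classEntropy(y(left)) ...
            - (1 - wL) * classEntropy(y(~left));
end
[bestGain, bestFeature] = max(gain);

% Split the rows of the data set on the best feature.
mask      = X(:, bestFeature) <= median(X(:, bestFeature));
leftRows  = data(mask, :);
rightRows = data(~mask, :);
```

Here the first feature separates the two classes perfectly, so it has the highest gain. A fuller version would search over several thresholds per feature rather than fixing the median.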
Hope it helps!
Thanks
