Oversampling method (SMOTE)

Maryam Samami on 14 Aug 2017
Edited: Walter Roberson on 4 Jul 2018
Dear all, I have used SMOTE (an oversampling method for balancing a data set), but the balanced data set it returns has no label column. The number of rows increases, but the label column is not carried along: the original data set is 1000×25 and the balanced data set is 2200×24, without the label column. The labels are returned separately in "final_labels", which is 2200×1 but contains only label 1. It should contain both labels 1 and 2.
I would be very grateful if anyone could guide me. Any suggestion will be appreciated.
------------------------------------------------
This is the script I use to balance the data set:
-----------------------------------------------------
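load creditgerman.mat
a = creditgerman;
[n, m] = size(a);
total_rows = (1:n);                 % row indices (not used below)
original_features = a(:, 1:m-1);    % every column except the last
original_mark = a(:, m);            % last column holds the class labels
[creditgerman_balanced_SMOTE, final_labels] = SMOTE(original_features, original_mark);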
--------------------------------------------------------------------------
And this is the SMOTE code I used:
function [final_features, final_mark] = SMOTE(original_features, original_mark)
% Rows with label 2 are used as the positive points to oversample
ind = find(original_mark == 2);
% P = candidate points
P = original_features(ind, :);
T = P';
% X = Complete Feature Vector
X = T;
% Find the 5 nearest positive neighbours of every positive point
% (6 are requested; column 1 is skipped in the loop below)
I = nearestneighbour(T, X, 'NumberOfNeighbours', 6);
I = I';
[r, c] = size(I);
S = [];
th = 0.3;
% Create one synthetic point between each positive point and each neighbour
for i = 1:r
    for j = 2:c
        index = I(i, j);
        new_P = P(i, :) + ((P(index, :) - P(i, :)) * rand);
        S = [S; new_P];
    end
end
original_features = [original_features; S];
[r, c] = size(S);
% Every synthetic row is given label 1 here
mark = ones(r, 1);
original_mark = [original_mark; mark];
% Cleaning step: exclude points whose neighbours disagree with their label
train_incl = ones(length(original_mark), 1);
I = nearestneighbour(original_features', original_features', 'NumberOfNeighbours', 6);
I = I';
for j = 1:length(original_mark)
    neighbors = I(j, 2:6);
    len = length(find(original_mark(neighbors) ~= original_mark(j, 1)));
    if (len >= 2)
        if (original_mark(j, 1) == 1)
            % Label-1 point: exclude its neighbours that carry the other label
            train_incl(neighbors(original_mark(neighbors) ~= original_mark(j, 1)), 1) = 0;
        else
            % Any other label: exclude the point itself
            train_incl(j, 1) = 0;
        end
    end
end
final_features = original_features(train_incl == 1, :);
final_mark = original_mark(train_incl == 1, :);
end
-----------------------------------------------------------
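A possible explanation, offered as an assumption rather than a confirmed diagnosis: the posted function interpolates new rows from the label-2 points but labels them with `mark = ones(r,1)`, so every synthetic row gets label 1, and the cleaning loop can then exclude the original label-2 rows because their nearest neighbours now carry label 1. The "missing" label column is expected, since features and labels come back as two separate outputs and have to be concatenated. The sketch below illustrates both points on toy data. The variable names (`features`, `labels`, `minorityLabel`, `k`) and the random data are hypothetical, the neighbour search is written in base MATLAB instead of the File Exchange `nearestneighbour` function, and the cleaning step is left out.

```matlab
% Toy data standing in for creditgerman (hypothetical values, not the real data)
rng(0);                                      % fixed seed for a repeatable run
features = [randn(20, 3); randn(6, 3) + 3];  % 20 majority rows, 6 minority rows
labels   = [ones(20, 1); 2 * ones(6, 1)];

minorityLabel = 2;   % assumed minority class, as in the posted code
k = 3;               % synthetic rows generated per minority row

P = features(labels == minorityLabel, :);
nMin = size(P, 1);
S = zeros(nMin * k, size(P, 2));             % preallocated synthetic rows
row = 0;
for i = 1:nMin
    % Euclidean distance from minority row i to every minority row
    d = sqrt(sum(bsxfun(@minus, P, P(i, :)).^2, 2));
    [~, order] = sort(d);
    neighbours = order(2:k + 1);             % order(1) is row i itself
    for j = 1:k
        row = row + 1;
        % Random point on the segment between row i and its neighbour
        S(row, :) = P(i, :) + rand * (P(neighbours(j), :) - P(i, :));
    end
end

% Synthetic rows inherit the minority label instead of a hard-coded 1
syntheticLabels = minorityLabel * ones(size(S, 1), 1);

% Re-attach the label column so the result has the same layout as the input
balanced = [features, labels; S, syntheticLabels];

% Rows per class: 20 with label 1, 24 with label 2 (6 original + 18 synthetic)
counts = accumarray(balanced(:, end), 1);
disp(counts');
```

In the posted script the same concatenation would be `[creditgerman_balanced_SMOTE, final_labels]`, and tabulating `final_labels` in the same way shows quickly whether one class has disappeared.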

Answers (0)
