Binary GA returns floating point numbers

Fabian Hofmann on 19 Feb 2024
Commented: Fabian Hofmann on 19 Feb 2024
Hello,
I am trying to find the optimal wavelengths in NIR spectra for PLS regression. My code runs, but the solution sometimes contains floating-point numbers. How can I tell ga that 0 and 1 are the only possible gene values?
Or can I simply treat any nonzero value as true?
% Set seed for reproducibility
rng(42);
% Download data and define arrays
url = 'https://figshare.com/ndownloader/files/1649903';
filename = 'data.xlsx';
data = readmatrix(websave(filename, url));
X = data(2:end,9:end);
y = data(2:end,1);
% Average adjacent blocks of 4 wavelengths. MATLAB reshapes column-major,
% so the within-block index must be the second dimension.
Xavg = squeeze(mean(reshape(X, size(X,1), 4, []), 2));
% Define the fitness function. ga minimizes its objective, so return the
% RMSE itself rather than its reciprocal.
fitness_function = @(solution) sqrt(mean((y - regressor(Xavg(:,logical(solution)), y)).^2));
% Define the initial population
init_pop = generate_initial_population(size(Xavg,2), 50, 25);
% Define the GA instance
options = optimoptions('ga', 'PopulationSize', 50, 'InitialPopulationMatrix', init_pop, ...
'MutationFcn', {@mutationadaptfeasible, 0.3}, 'CrossoverFcn', @crossoverscattered, ...
'EliteCount', 10, 'MaxGenerations', 100, 'UseParallel', true, 'Display', 'iter');
% Run GA
[solution, fval] = ga(fitness_function, size(Xavg,2), [], [], [], [], zeros(size(Xavg,2),1), ones(size(Xavg,2),1), [], options);
function y_pred = regressor(X, y)
    % Candidate numbers of PLS components
    parameters_gs = 1:6;
    best_mse = inf;
    best_n_components = 0;
    for n_components = parameters_gs
        % Fit a PLS model with n_components components
        [~,~,~,~,beta] = plsregress(X, y, n_components);
        % Predict on the training data
        y_pred = [ones(size(X,1),1) X] * beta;
        % Keep the component count with the lowest training MSE
        mse = mean((y - y_pred).^2);
        if mse < best_mse
            best_mse = mse;
            best_n_components = n_components;
        end
    end
    % Refit with the best component count and compute the final prediction
    [~,~,~,~,beta] = plsregress(X, y, best_n_components);
    y_pred = [ones(size(X,1),1) X] * beta;
end
function init_population = generate_initial_population(array_size, solutions_per_pop, number_of_bands)
    % Start with a logical array of zeros
    init_population = false(solutions_per_pop, array_size);
    % Define an index array the size of the spectral wavelengths
    index_array = 1:array_size;
    for i = 1:solutions_per_pop
        % Randomly shuffle the index array
        index_array = index_array(randperm(length(index_array)));
        % Select the first number_of_bands shuffled indices and flip those entries in row i
        init_population(i, index_array(1:number_of_bands)) = ~init_population(i, index_array(1:number_of_bands));
    end
    % ga expects a numeric population matrix
    init_population = double(init_population);
end
Thanks for helping
F
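On the second part of the question: `logical` does map every nonzero value to true, so the fitness function runs even with fractional genes. The catch is that a gene of 0.01 and a gene of 0.99 then select the same wavelength, which may not be what is intended. The following is a minimal sketch of the difference between "nonzero is true" and an explicit threshold; the `solution` vector is made up for illustration and is not output from the script above.

```matlab
% Hypothetical ga output containing fractional genes (illustration only)
solution = [0 0.37 1 0.92 0 1e-9];

% "Any nonzero value is true": even a tiny gene selects the column
mask_nonzero = logical(solution);

% Explicit threshold: only genes at or above 0.5 select the column
mask_threshold = solution >= 0.5;

fprintf('selected by logical():   %d of %d\n', nnz(mask_nonzero), numel(solution));
fprintf('selected by >= 0.5 test: %d of %d\n', nnz(mask_threshold), numel(solution));
```

Either mask can be used as a column index, for example `Xavg(:, mask_threshold)`, but the two can pick different wavelength sets from the same individual.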

Answers (1)

Walter Roberson on 19 Feb 2024
% Set seed for reproducibility
rng(42);
% Load data and define arrays
data = readmatrix('Data/File_S1.xlsx');
Error using readmatrix
Unable to find or open 'Data/File_S1.xlsx'. Check the path and filename or file permissions.
X = data(2:end,9:end);
y = data(2:end,1);
% Average adjacent blocks of 4 wavelengths. MATLAB reshapes column-major,
% so the within-block index must be the second dimension.
Xavg = squeeze(mean(reshape(X, size(X,1), 4, []), 2));
% Define the fitness function. ga minimizes its objective, so return the
% RMSE itself rather than its reciprocal.
fitness_function = @(solution) sqrt(mean((y - regressor(Xavg(:,logical(solution)), y)).^2));
% Define the initial population
init_pop = generate_initial_population(size(Xavg,2), 50, 25);
% Define the GA instance
options = optimoptions('ga', 'PopulationSize', 50, 'InitialPopulationMatrix', init_pop, ...
'MutationFcn', {@mutationadaptfeasible, 0.3}, 'CrossoverFcn', @crossoverscattered, ...
'EliteCount', 10, 'MaxGenerations', 100, 'UseParallel', true, 'Display', 'iter', ...
'PopulationType', 'bitstring');  % use a bit-string (0/1) population
% Run GA
[solution, fval] = ga(fitness_function, size(Xavg,2), [], [], [], [], zeros(size(Xavg,2),1), ones(size(Xavg,2),1), [], options);
function y_pred = regressor(X, y)
    % Candidate numbers of PLS components
    parameters_gs = 1:6;
    best_mse = inf;
    best_n_components = 0;
    for n_components = parameters_gs
        % Fit a PLS model with n_components components
        [~,~,~,~,beta] = plsregress(X, y, n_components);
        % Predict on the training data
        y_pred = [ones(size(X,1),1) X] * beta;
        % Keep the component count with the lowest training MSE
        mse = mean((y - y_pred).^2);
        if mse < best_mse
            best_mse = mse;
            best_n_components = n_components;
        end
    end
    % Refit with the best component count and compute the final prediction
    [~,~,~,~,beta] = plsregress(X, y, best_n_components);
    y_pred = [ones(size(X,1),1) X] * beta;
end
function init_population = generate_initial_population(array_size, solutions_per_pop, number_of_bands)
    % Start with a logical array of zeros
    init_population = false(solutions_per_pop, array_size);
    % Define an index array the size of the spectral wavelengths
    index_array = 1:array_size;
    for i = 1:solutions_per_pop
        % Randomly shuffle the index array
        index_array = index_array(randperm(length(index_array)));
        % Select the first number_of_bands shuffled indices and flip those entries in row i
        init_population(i, index_array(1:number_of_bands)) = ~init_population(i, index_array(1:number_of_bands));
    end
    % ga expects a numeric population matrix
    init_population = double(init_population);
end
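The only functional change in the code above is the added 'PopulationType', 'bitstring' option. As a rough, self-contained illustration of that option, here is a sketch on a toy problem that needs no data file. The `target` mask and the mismatch-count objective are hypothetical and stand in for the PLS fitness function. The sketch also swaps in @mutationuniform, which the ga documentation lists as a mutation operator for bit-string populations; that substitution is an assumption of this sketch, not part of the answer.

```matlab
% Toy problem (hypothetical): recover a hidden 0/1 mask of length 12
rng(42);
nvars  = 12;
target = [1 0 1 1 0 0 1 0 1 0 0 1];

% ga minimizes its objective, so count the mismatched bits
objective = @(bits) sum(bits ~= target);

options = optimoptions('ga', ...
    'PopulationType', 'bitstring', ...       % individuals are vectors of 0s and 1s
    'MutationFcn', @mutationuniform, ...     % assumption: bit-string-compatible operator
    'CrossoverFcn', @crossoverscattered, ...
    'PopulationSize', 40, 'MaxGenerations', 100, 'Display', 'off');

% Bounds are left empty here: ga ignores constraints for bit-string populations
[solution, fval] = ga(objective, nvars, [], [], [], [], [], [], [], options);

is_binary = all(solution == 0 | solution == 1);
fprintf('all genes binary: %d, mismatched bits: %g\n', is_binary, fval);
```

If the check prints `all genes binary: 1`, every gene in the returned individual is exactly 0 or 1, so `logical(solution)` can be used as a column mask without any thresholding.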
Comments: 1
Fabian Hofmann on 19 Feb 2024
Thank you for pointing out the error. The script is now running.


Release

R2023b
