Why do the partial dependence plots I code myself not match the plots from the Matlab "plotPartialDependence" function?
When fitting a random forest model to a sample problem, I have not been able to reproduce the partial dependence plots produced by the "plotPartialDependence" function with my own code.
I start with a very simple system, Y = X1 - X2 plus some noise, then fit the random forest model. I then replace every row of the second variable (X2) with its average value and get the predictions. In theory, the two curves should match up, but they are always offset by a small amount (although they always have the same shape).
I have tried this for several different sample equations and it always comes out the same. Any ideas what might be causing the offset?
range = [0:0.01:25]'; % Range of numbers
constant1 = 2; % First constant
constant2= 3.5; % Second constant
X(:,1) = range./(constant1+range); % First equation
X(:,2) = range./(constant2+range); % Second equation
for i = 1:size(range,1)
rng(i,'twister') % For reproducibility
Y(i,1) = (X(i,1) - X(i,2)) + 0.1*rand(1,1); % Response variable
end
% Run the random forest model
Mdl_rf = TreeBagger(500,X,Y,'OOBPredictorImportance','on','PredictorSelection','interaction-curvature','Method','regression');
X(:,2) = mean(X(:,2)); % Substitute the mean value of Column 2 for all rows in Column 2
predictions_rf = predict(Mdl_rf,X); % Get the predictions based on the new data
figure
plotPartialDependence(Mdl_rf,1) % Plot using the partial dependence function
hold on
scatter(X(:,1),predictions_rf) % Plot the values for the first variable against the new predictions
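One possible source of the offset, offered as a sketch rather than a confirmed diagnosis: partial dependence is usually defined as the mean of the predictions taken over the observed values of the other predictors, whereas the code above takes the prediction at the mean of X2. For a non-linear model these two quantities generally differ. The example below uses a made-up step-function predictor (`predictFcn`) and a tiny made-up data set (`Xtrain`) in place of the random forest; both names and values are assumptions for illustration only.

```matlab
% Toy stand-in for a fitted model (hypothetical, for illustration only):
% a step in x2, loosely mimicking the piecewise-constant output of trees.
predictFcn = @(X) X(:,1) - double(X(:,2) > 0.5);

% Small made-up predictor matrix: column 1 is x1, column 2 is x2.
Xtrain = [0.1 0.2; 0.4 0.3; 0.7 0.4; 0.9 0.9];

% Query points for x1, evenly spaced over its observed range.
queryPts = linspace(min(Xtrain(:,1)), max(Xtrain(:,1)), 5)';

pd     = zeros(size(queryPts));  % mean of predictions (partial dependence)
atMean = zeros(size(queryPts));  % prediction at the mean of x2

for k = 1:numel(queryPts)
    Xk = Xtrain;
    Xk(:,1) = queryPts(k);                 % fix x1 in every row, keep observed x2
    pd(k) = mean(predictFcn(Xk));          % average over the observed x2 values
    atMean(k) = predictFcn([queryPts(k), mean(Xtrain(:,2))]);  % x2 set to its mean
end

% Same shape, shifted curve: the two definitions disagree for this predictor.
offset = pd - atMean;
disp(offset')
```

For this toy predictor the two curves have the same shape and differ by a constant shift, which resembles the behaviour described above. To try the same loop on the forest, `predictFcn` could be replaced with `@(X) predict(Mdl_rf, X)` and `Xtrain` with the original predictor matrix (a copy saved before column 2 is overwritten), ideally with a modest number of query points to keep the run time down.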