What does a cross (horizontal line) in the regression plot of a neural network with multivariate input and output mean?

Hello everyone,
I have trained a neural network and obtained the regression plot below.
First of all, I normalized every input and output by subtracting its mean over the samples and dividing by its standard deviation, so that all inputs and outputs are normalized and in the same range. Is that allowed, or does it introduce errors into my data?
I have tried several network structures and always get such a cross in the regression plot. My guess is that some of the output data is insensitive to the input data. Is that right? If so, how can I check that?
Thanks for your help! Best regards,
Pablo

Accepted Answer

Cris LaPierre on 2 Dec 2020
Edited: Cris LaPierre on 3 Dec 2020
Normalizing is a standard preprocessing step. It is helpful when your model has several inputs on different scales, because it prevents any one feature from dominating the model purely because of its scale. When you have a single input, this is unnecessary. Also, this is a preprocessing step. I don't think it makes sense to do it after the fact, and doing so could be affecting your visualization.
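As a rough illustration of the normalization discussed here, the following is a minimal sketch in plain Python (not MATLAB, and not the code used in this thread). The function names `zscore_columns` and `unscale_columns` and the sample data are made up for the example. It assumes each column is one variable and each row is one sample, and it uses the sample standard deviation (n-1 denominator).

```python
import statistics

def zscore_columns(rows):
    """Scale each column to zero mean and unit sample standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]  # n-1 denominator
    scaled = [
        # A constant column (std == 0) carries no information; map it to 0.
        [(value - m) / s if s > 0 else 0.0 for value, m, s in zip(row, means, stds)]
        for row in rows
    ]
    return scaled, means, stds

def unscale_columns(scaled, means, stds):
    """Map standardized values back to the original units."""
    return [[z * s + m for z, m, s in zip(row, means, stds)] for row in scaled]

# Hypothetical data: two variables on very different scales.
samples = [[1.0, 1000.0], [2.0, 3000.0], [3.0, 5000.0]]
scaled, means, stds = zscore_columns(samples)
restored = unscale_columns(scaled, means, stds)
print(scaled)
```

Keeping `means` and `stds` from the training data is what makes it possible to map network outputs back to their original units before plotting, if that is preferred.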
A cross would suggest there are two different types of data in your data set: one with a relationship and one without. The horizontal part corresponds to data points for which there is no relationship between the Target and the Output.
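To see why unrelated data shows up as a horizontal line, here is a small sketch in plain Python. It is only an illustration under stated assumptions: a least-squares line stands in for the trained network, the data is synthetic, and the names `fit_line` and `model_outputs` are invented for the example. When a target does not depend on the input, the best the model can do is predict roughly the target's mean, so the output barely changes while the target varies.

```python
import random

def fit_line(x, y):
    """Ordinary least-squares fit y ~ intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def model_outputs(x, targets):
    """Stand-in for a trained network: fit a line, then predict on x."""
    slope, intercept = fit_line(x, targets)
    return [intercept + slope * a for a in x]

rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(500)]
t_dependent = [2.0 * a for a in x]                 # target driven by the input
t_insensitive = [rng.gauss(0.0, 1.0) for _ in x]   # target unrelated to the input

y_dependent = model_outputs(x, t_dependent)
y_insensitive = model_outputs(x, t_insensitive)

# Slope of Output versus Target, as read off a regression plot.
slope_dep, _ = fit_line(t_dependent, y_dependent)
slope_ins, _ = fit_line(t_insensitive, y_insensitive)
print("dependent target slope:", slope_dep)
print("insensitive target slope:", slope_ins)
```

Plotting both sets of points in one Output-versus-Target plot would give a diagonal branch (slope near 1) and a nearly flat branch (slope near 0), which together look like a cross.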
  1 Comment
Pablo Noever on 3 Dec 2020
Edited: Pablo Noever on 3 Dec 2020
Thank you Cris for your reply.
First of all, as I guessed and you confirmed, the cross in the regression plot is due to missing links between the inputs and outputs of the samples. By performing a sensitivity analysis on the normalized input and output data of the samples, I discarded every input that does not contribute to any output parameter and every output parameter that is not affected by any input parameter. With that, all data points on the horizontal axis disappear. Thanks for that. The result is as follows:
So I get very good agreement between the target (sample output) and the ANN output.
Second, I agree that normalizing the sample inputs is necessary. But normalizing the sample outputs (targets) can also be helpful: with all data in the same range, it is easier to judge whether the approximation is good for all of it. I did the same without normalizing the targets (see below), and you can see good agreement (R=1). But the lowest values lie in a very small range compared to the others, so it is hard to evaluate their deviations, which are not obvious at this scale.
Nevertheless, this is just a detail. The main problem is solved. Insensitive target values cannot be represented by the ANN because they have no link to the input parameters, and thus they produce a horizontal line in the regression plot. A sensitivity analysis is a good way to discard all insensitive input and output parameters.
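The sensitivity analysis itself is not described in this thread, so the following is only one possible sketch in plain Python, not the method actually used. It assumes a simple screen based on the absolute Pearson correlation between every input column and every output column; the threshold of 0.3, the function names `pearson` and `screen`, and the synthetic data are all made up for the example. Note that a linear correlation can miss purely nonlinear dependencies.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    saa = sum((u - mean_a) ** 2 for u in a)
    sbb = sum((v - mean_b) ** 2 for v in b)
    if saa == 0 or sbb == 0:
        return 0.0
    sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    return sab / math.sqrt(saa * sbb)

def screen(inputs, outputs, threshold=0.3):
    """Return indices of input and output columns that show some linkage."""
    strength = [[abs(pearson(xc, yc)) for yc in outputs] for xc in inputs]
    keep_in = [i for i, row in enumerate(strength) if max(row) >= threshold]
    keep_out = [
        j for j in range(len(outputs))
        if max(row[j] for row in strength) >= threshold
    ]
    return keep_in, keep_out

rng = random.Random(1)
n = 400
x0 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
x1 = [rng.uniform(-1.0, 1.0) for _ in range(n)]          # feeds no output
y0 = [3.0 * a + 0.1 * rng.gauss(0.0, 1.0) for a in x0]   # driven by x0
y1 = [rng.gauss(0.0, 1.0) for _ in range(n)]             # driven by no input

keep_in, keep_out = screen([x0, x1], [y0, y1])
print("inputs to keep:", keep_in)
print("outputs to keep:", keep_out)
```

In this synthetic case the screen keeps only the first input and the first output, which mirrors the idea of discarding inputs that contribute to no output and outputs that no input affects.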
