Cleaning data for machine learning

FERNANDO CALVO RODRIGUEZ on 14 Mar 2023
Commented: FERNANDO CALVO RODRIGUEZ on 31 Mar 2023
Hey!
I am trying to handle missing data, stored as NaN, in a regression that uses the neural network function fitnet. I don't know the true values of these missing entries, and I can't remove the affected observations because the data would lose its meaning. I know that in Python this can be done with a pandas drop function, but in MATLAB I don't know how to do it without getting an error from the neural network.
If anyone knows how, I would appreciate the help.
  3 Comments
Luca Ferro on 14 Mar 2023
Edited: Luca Ferro on 14 Mar 2023
It's not quite clear to me if you want to remove the NaNs or replace them with 0s.
In any case, it would be very useful if you could share the data.
FERNANDO CALVO RODRIGUEZ on 15 Mar 2023
I don't want to remove them, I just want the network to ignore them. With many NaN values the network does not work, even though the data that is present is more than enough to find the answers.

Accepted Answer

Vijeta on 28 Mar 2023
Hi Fernando,
One way to handle missing data (NaN values) in a regression problem using the fitnet function in MATLAB is to impute the missing values with some reasonable estimate before feeding the data into the neural network. There are several methods for imputing missing values, such as mean imputation, median imputation, and regression imputation.
  • The Missing Data Imputation (MDI) Toolbox provides a graphical, user-friendly MATLAB interface for this.
  • The MDI Toolbox imputes incomplete datasets whose values are missing completely at random.
  • It includes several state-of-the-art methods, such as trimmed scores regression and data augmentation.
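As a rough illustration of the mean and median imputation mentioned above, here is a minimal sketch in base MATLAB. The matrix X and its values are made up for the example (rows are observations, columns are features); regression imputation and the MDI Toolbox are not shown.

```matlab
% Hypothetical data: rows are observations, columns are features.
X = [1.0  NaN  3.0;
     2.0  5.0  NaN;
     NaN  7.0  9.0;
     4.0  9.0  6.0];

% Record where values were missing before filling them in.
wasMissing = isnan(X);

% Column-wise estimates computed from the known entries only.
colMean   = mean(X, 1, 'omitnan');
colMedian = median(X, 1, 'omitnan');

% Mean imputation: each NaN is replaced by the mean of its column.
XMean = fillmissing(X, 'constant', colMean);

% Median imputation: same idea, using the column median instead.
XMedian = fillmissing(X, 'constant', colMedian);
```

The filled matrix could then be passed to the network in place of the original data, for example `net = train(fitnet(10), XMean', t')` with a hypothetical target vector `t`. fitnet and train need the Deep Learning Toolbox and expect one observation per column, hence the transposes. Keeping `wasMissing` makes it possible to check later how strongly the imputed entries influence the fit.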
Thanks.
  1 Comment
FERNANDO CALVO RODRIGUEZ on 31 Mar 2023
That isn't quite what I want to do, because the response variables follow a function I don't know: they describe the behavior of a new material under compression. But yes, what you suggest would work for somewhat simpler networks.

Release

R2022b
