How to use Linear Discriminant Analysis to classify training and testing datasets?

Question

Yihan Ma, July 16, 2019

https://kr.mathworks.com/matlabcentral/answers/471877-how-to-using-linear-discriminant-analysis-for-classify-training-and-testing-dataset

Answered: Prantik Chatterjee, March 21, 2024

I am a little confused about using the Linear Discriminant Analysis (LDA) algorithm for classification after reading some articles.

1. As I understand it, training data and testing data should be kept separate for classification. When reducing the dimension with LDA, should I combine the training and testing data and reduce them together, or should I reduce only the training data and then use the eigenvector matrix W to map the testing data to the lower dimension?

2. Some articles mention standardizing the dataset. Should I standardize the whole dataset (training and testing) together, or standardize only the training data and then use the same scaling to map the testing data?

Answer 1

Prantik Chatterjee, March 21, 2024

https://kr.mathworks.com/matlabcentral/answers/471877-how-to-using-linear-discriminant-analysis-for-classify-training-and-testing-dataset#answer_1428856

1. Your second approach is the correct one. When reducing dimensionality with LDA, fit the LDA model on the training data only, then use the transformation learned from it (the projection matrix W, whose columns are the eigenvectors) to map both the training and the testing data to the lower-dimensional space.

2. The same applies to data standardization. Compute the standardization parameters (the mean and standard deviation of each feature) on the training data only, then apply those same parameters to scale the test data.
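As an illustration of point 2, here is a minimal sketch in plain Python (standard library only, not MATLAB). The helper names `fit_standardizer` and `apply_standardizer` and the toy data are hypothetical, and the population standard deviation is an arbitrary choice for the example. The point is only that the mean and standard deviation come from the training rows and are then reused unchanged on the test rows.

```python
from statistics import mean, pstdev

def fit_standardizer(rows):
    """Return per-feature means and standard deviations of the given rows."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    # Fall back to 1.0 for constant features to avoid dividing by zero.
    stds = [pstdev(c) or 1.0 for c in columns]
    return means, stds

def apply_standardizer(rows, means, stds):
    """Scale rows with previously fitted statistics (no refitting)."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

# Toy data (assumed for illustration): 3 training rows, 1 test row, 2 features.
X_train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
X_test = [[4.0, 40.0]]

# Fit on the training data only ...
means, stds = fit_standardizer(X_train)

# ... then apply the same statistics to both sets.
Z_train = apply_standardizer(X_train, means, stds)
Z_test = apply_standardizer(X_test, means, stds)
```

Note that `Z_test` is not expected to have zero mean or unit variance on its own; it is simply expressed on the scale learned from the training set.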

The reason in both cases is to avoid data leakage. Fitting any kind of transformation on the combined training and test data lets information from the test data influence the model training process. This can result in overly optimistic performance metrics and a model that does not generalize well to unseen data.
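A leakage-free LDA projection can be sketched the same way. The following is a simplified two-class, two-feature Fisher discriminant in plain Python, not the full multi-class eigenvector formulation and not a MATLAB implementation. All function names and data are hypothetical. The direction `w` and the decision threshold are computed from the training set alone; the test set is only projected and classified.

```python
def class_mean(rows):
    """Mean vector of a list of feature rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def scatter(rows, mu):
    """2x2 scatter matrix of rows around the mean mu."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def project(X, w):
    """Map each row to one dimension along direction w."""
    return [x[0] * w[0] + x[1] * w[1] for x in X]

def fit_lda(X, y):
    """Fisher direction w = Sw^-1 (mu1 - mu0) and a midpoint threshold."""
    X0 = [x for x, label in zip(X, y) if label == 0]
    X1 = [x for x, label in zip(X, y) if label == 1]
    mu0, mu1 = class_mean(X0), class_mean(X1)
    s0, s1 = scatter(X0, mu0), scatter(X1, mu1)
    # Within-class scatter is the sum of the per-class scatter matrices.
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu1[0] - mu0[0], mu1[1] - mu0[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Threshold halfway between the projected class means.
    z0, z1 = project([mu0, mu1], w)
    return w, (z0 + z1) / 2.0

# Toy data (assumed for illustration).
X_train = [[1, 2], [2, 1], [2, 3], [3, 2], [6, 6], [7, 5], [7, 7], [8, 6]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]
X_test = [[1.5, 2.5], [7.5, 6.5]]

# Fit on the training data only; the test data never enters fit_lda.
w, threshold = fit_lda(X_train, y_train)

# Reuse the learned w to map both sets to the lower dimension.
z_train = project(X_train, w)
z_test = project(X_test, w)
y_pred = [1 if z > threshold else 0 for z in z_test]
```

If standardization is also used, it would go before `fit_lda`, with its statistics likewise taken from `X_train` only.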
