How can I use knnimpute when every row of the input matrix has at least one missing value?

While trying to use knnimpute to fill in missing data, I get the following error: "All rows in the input data contain missing values. Unable to impute missing values."
In most cases it is not realistic to expect a feature (a row of knnimpute's data matrix argument) to have no missing values. As long as each feature has a sufficient number of observations (columns) with complete values, I would expect this not to cause a problem.
Comments: 2
Larali on 24 Mar 2017
Hi Hambisa,
Did you find a solution to this problem? It would be great to know how you dealt with this issue.
Best, Lara
kokila Mani on 6 Mar 2018
Hi, you can transpose the matrix and try again, i.e. At = A'; knnimpute(At);
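A note on when transposing can help: the rows of A' are the columns of A, so the same error would be expected unless A has at least one column (observation) with no NaN. Transposing also swaps the roles of features and observations, so values are filled in from similar features rather than from similar observations. The sketch below checks both conditions using only core MATLAB functions; the matrix and the variable names are made up for illustration, and the knnimpute call is left commented out because it needs the Bioinformatics Toolbox.

```matlab
% Hypothetical data: every row has a NaN, but columns 1 and 2 are complete.
A = [1  2 NaN   4;
     5  6   7 NaN;
     9 10 NaN  12];

% Condition that triggers the "All rows ... contain missing values" error.
everyRowHasNaN = all(any(isnan(A), 2));

% Transposing only helps if at least one column of A is free of NaN,
% because that column becomes a complete row of A'.
hasCompleteColumn = any(~any(isnan(A), 1));

if everyRowHasNaN && hasCompleteColumn
    % Assumption: imputing across features is acceptable for your data.
    % impA = knnimpute(A')';   % transpose, impute, transpose back
end
```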


Answers (1)

Tim DeFreitas on 28 Mar 2019
This is an older question, but in case anyone comes across this answer looking for further explanation:
knnimpute only calculates distance between observation columns using rows that do not contain NaN values. This is because if NaN rows were included, the distance between columns containing NaN values would also be NaN, and there would be no way to rank the k nearest neighbors for any observation.
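To make the NaN propagation concrete, here is a minimal sketch in core MATLAB. It is not knnimpute's actual implementation, only an illustration of the idea using a Euclidean distance; the matrices and variable names are made up.

```matlab
% Toy data: rows are features, columns are observations.
A = [1 2   3 NaN;
     8 7 NaN   5;
     6 5   4   3];

% Naive distance between columns 1 and 4 over all rows:
% the NaN in row 1 propagates through the sum, so the result is NaN.
dNaive = sqrt(sum((A(:,1) - A(:,4)).^2));

% Restrict the distance to rows that contain no NaN in any column.
completeRows = ~any(isnan(A), 2);
dComplete = sqrt(sum((A(completeRows,1) - A(completeRows,4)).^2));

% If every row has at least one NaN, no rows are left to compute a distance.
B = [1 NaN 3; NaN 5 6];
noCompleteRows = ~any(~any(isnan(B), 2));
```

Here only row 3 of A is complete, so dComplete is computed from that single feature, while B has no usable rows at all, which is the situation the error message describes.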
You could force knnimpute to replace NaN values with the average of a feature across all non-NaN observations by adding a "feature" row in which every observation is identical, and then removing it afterwards:
A = [1 2 3 4 5 6 7 8 NaN; 8 7 6 5 4 3 2 NaN 1; 6 5 4 3 2 1 NaN 8 7];
A(4,:) = ones(1,9);   % dummy feature row with no NaN, identical for every observation
impA = knnimpute(A)
impA(4,:) = []        % remove the dummy row again
