Using unbalanced data with fitlme
Hi,
I am trying to get fitlme to work with unbalanced data, but I always get the error "Fixed Effects design matrix X must be of full column rank." I looked into the code to find the problem: fitlme truncates my data but retains the category names from the input table, which leads to a rank deficiency.
My data contains complete rows, but also rows with missing values, for example [2012, 'String1', 1, NaN, 44.91, 62.9]. The last column is the response; the rest are predictors. Inside fitlme, my 12042-row input table is truncated to 628 rows, so apparently every row containing at least one NaN is deleted.
Shashank Prasanna talks about unbalanced data in this video, but how exactly does that work? I tried everything I could and don't know how to proceed.
Comments: 6
the cyclist
2 January 2023
If you are OK with using only the data where you have complete rows, then you can just remove the incomplete rows yourself before calling fitlme, and therefore before the dummy variables are created.
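A minimal sketch of that approach, assuming the rank deficiency comes from category levels that no longer occur once the incomplete rows are gone (as the question describes). The table, its variable names, and the model formula are hypothetical stand-ins, not the original data:

```matlab
% Hypothetical data; all names and the formula below are assumptions.
rng(0);
n    = 200;
Year = categorical(randi([2010 2014], n, 1));
Site = categorical(randi(4, n, 1), 1:4, ...
                   {'String1','String2','String3','String4'});
X1   = randn(n, 1);
X2   = randn(n, 1);
Y    = 50 + 2*X1 - X2 + randn(n, 1);
tbl  = table(Year, Site, X1, X2, Y);

% Simulate the problem: one category only ever appears in incomplete rows.
tbl.X2(tbl.Site == 'String4') = NaN;

% Remove every row that contains a missing value.
clean = rmmissing(tbl);

% Drop category levels that no longer occur in the remaining rows,
% so that no all-zero dummy column can be generated for them.
clean.Site = removecats(clean.Site);
clean.Year = removecats(clean.Year);

% Fit on the cleaned table (requires Statistics and Machine Learning Toolbox).
lme = fitlme(clean, 'Y ~ X1 + X2 + Site + (1|Year)');
```

Whether unused categories are the actual cause in your case is worth checking with `categories(clean.Site)` before and after the `removecats` call.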
If you are not OK with that, then I would again say that you need to solve your missing data problem, not your rank deficiency issue.
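As a starting point for looking at the missing data itself, a short sketch like the following can show which variables account for the lost rows. The table here is a hypothetical toy example:

```matlab
% Hypothetical toy table; variable names are assumptions.
T = table([1; NaN; 3; 4], [NaN; NaN; 0.3; 0.4], [10; 20; 30; 40], ...
          'VariableNames', {'A', 'B', 'Response'});

% Number of missing entries in each variable (one count per column).
nMissingPerVar = sum(ismissing(T), 1);

% Number of rows with no missing entries at all.
nCompleteRows = sum(~any(ismissing(T), 2));

disp(array2table(nMissingPerVar, 'VariableNames', T.Properties.VariableNames));
fprintf('%d of %d rows are complete\n', nCompleteRows, height(T));
```

If one or two variables are responsible for most of the missing values, leaving them out of the model may preserve far more rows than dropping every incomplete row.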
Answers (2)
Sulaymon Eshkabilov
2 January 2023
Suggestion: if you are not using all columns of your data, it is reasonable to clean up only the columns that are actually used, by removing the rows where data is missing (NaN). You can use the isnan() or ismissing() function to clean up your data before processing it with fitlm() or fitlme(). Note that the example data used in the demo video has an excess of data points.
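A small sketch of this column-wise cleanup, using ismissing() on only the variables that enter the model. The table and its variable names are hypothetical, modelled on the example row in the question:

```matlab
% Hypothetical table; the first row mirrors the example row from the question.
T = table([2012; 2012; 2013], ...
          {'String1'; 'String2'; 'String1'}, ...
          [1; 2; 3], ...
          [NaN; 0.5; 0.7], ...
          [44.91; 40.2; NaN], ...
          [62.9; 60.1; 58.3], ...
          'VariableNames', {'Year', 'Label', 'A', 'B', 'C', 'Response'});

% Assumption: the model uses every variable except B.
usedVars = {'Year', 'Label', 'A', 'C', 'Response'};

% Flag rows that have a missing value in any of the used variables.
rowHasMissing = any(ismissing(T(:, usedVars)), 2);

% Keep the rows that are complete with respect to the used variables.
Tclean = T(~rowHasMissing, usedVars);
```

Here the first row survives even though B is NaN, because B is not among the used variables; only the third row, which is missing C, is removed.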