Replace a missing string in a table
조회 수: 16 (최근 30일)
이전 댓글 표시
I want to replace all missing strings in a table with a string of my choice, say 'unknown'. I use R2016a (without an upgrade option), so functions like fillmissing are not available to me, in case they could be of help. Eg:
dblVar = [NaN; 3; 7; 9];
cellstrVar = {'one'; 'three'; ''; 'nine'};
categoryVar = categorical({''; 'red'; 'yellow'; 'blue'});
A = table(dblVar, cellstrVar, categoryVar)
A =
dblVar cellstrVar categoryVar
______ __________ ___________
NaN 'one' <undefined>
3 'three' red
7 '' yellow
9 'nine' blue
I would like to end up with this:
A =
dblVar cellstrVar categoryVar
______ __________ ___________
NaN 'one' unknown
3 'three' red
7 'unknown' yellow
9 'nine' blue
Note I also replaced the categorical '<undefined>' as well, if you can please include in your answer.
Is there a way to do this without changing A's structure, eg from table to cell, in the process? The reason I want to avoid the transformation is my table is large, and transformation may cause memory issues.
Edit to add: the location of the missing string value has to be identified as well, there may be several such columns in the table.
Many thanks.
댓글 수: 0
채택된 답변
George
2016년 9월 28일
This should work:
dblVar = [NaN; 3; 7; 9];
cellstrVar = {'one'; 'three'; ''; 'nine'};
categoryVar = categorical({''; 'red'; 'yellow'; 'blue'});
cellstrVar2 = {'four'; 'none'; '7'; ''};
A = table(dblVar, cellstrVar, categoryVar, cellstrVar2);
varNames = A.Properties.VariableNames;
for ii = 1:numel(varNames)
if iscellstr(A{1,varNames{ii}})
undefloc = strcmp(A.(ii), '');
A{undefloc, ii} = cellstr('unknown');
end
if iscategorical(A{1, varNames{ii}})
undefloc = isundefined(A{:,ii});
A{undefloc, ii} = categorical(cellstr('unknown'));
end
end
A =
dblVar cellstrVar categoryVar cellstrVar2
______ __________ ___________ ___________
NaN 'one' unknown 'four'
3 'three' red 'none'
7 'unknown' yellow '7'
9 'nine' blue 'unknown'
You can use the undefloc variable to find where things were undefined or empty strings.
댓글 수: 4
George
2016년 9월 29일
There you go. If you do that and slam the cells and categoricals into cell arrays you can use the curly brace syntax on the cell array, rather than the variable name.
추가 답변 (1개)
Peter Perkins
2016년 10월 3일
George's loop seems fine to me although you could tweak it a bit as
for name = varNames
var = A.name;
if iscellstr(var)
var(strcmp(var,'')) = {'unknown'};
elseif iscategorical(A.(name))
var(isundefined(var) = 'unknown';
end
A.Name = var;
end
If you're willing to write a couple of small functions, you can do this:
theCellStrs = varfun(@iscellstr,A);
A(:,theCellStrs) = varfun(@replaceEmptyString,A(:,theCellStrs));
function c = replaceEmptyString(c)
c(strcmp(c,'') = {'Unknown'};
(and similarly for categorical) but varfun uses a loop underneath.
댓글 수: 0
참고 항목
카테고리
Help Center 및 File Exchange에서 Data Type Identification에 대해 자세히 알아보기
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!