Find variable name of categorical values repeated a number of times
I am trying to find the variable names of columns in a table that match a value, in my case a string. The values in my table are categorical and have values of 'p' or 'f'. I can do this for one row.
row=categorical({'p' 'f' 'p' 'f'});
t=array2table(row)
% t =
%
% 1×4 table
%
% row1 row2 row3 row4
% ____ ____ ____ ____
%
% p f p f
t.Properties.VariableNames(table2array(t(1,:))=='p')
% ans =
%
% 1×2 cell array
%
% {'row1'} {'row3'}
But I want to find the columns where the value is repeated several times, three in this case.
rows=[row;row;row];
ttt=array2table(rows)
% ttt =
%
% 3×4 table
%
% rows1 rows2 rows3 rows4
% _____ _____ _____ _____
%
% p f p f
% p f p f
% p f p f
t.Properties.VariableNames(table2array(ttt(1:3,:))=={'p';'p';'p'}) %does not work
%Variable index exceeds table dimensions.
isequal(table2array(ttt(1:3,3)),{'p';'p';'p'}) %does work for one column
% ans =
%
% logical
%
% 1
Not sure what to do, also not sure if there is a better way without converting my table to an array.
Comments: 4
Image Analyst
21 Dec 2019
Make it easy for us to help you, not hard. Attach the table in a .mat file with the paper clip icon so we can try things with your actual data.
Answers (2)
dpb
21 Dec 2019
This can be done one of two ways: convert to a string representation so you can use string matching functions, or use logical addressing and then find the locations of runs. The string approach may be simplest here. Brief outline of the idea:
s=reshape(char(tt.Var1),1,[]); % convert column to char() row string
fn=@(v,c) ~isempty(regexp(reshape(char(v),1,[]),[c '{3,}'])); % anon function to find 3 or more consecutive values in s
For your sample
>> fn(tt.Var1,'p')
ans =
logical
1
>> fn(tt.Var2,'p')
ans =
logical
0
>> fn(tt.Var1,'f')
ans =
logical
0
>>
To use, apply with varfun or a loop over the desired columns, building a logical vector by column. Then indexing the tt.Properties.VariableNames property with that logical vector will return the column names.
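A minimal sketch of that last step, using the ttt table from the question (variables rows1..rows4) rather than tt.Var1. The 'once' option and the names isRun and names are additions for illustration only, and the reshape trick assumes every category name is a single character:

```matlab
% Rebuild the sample table from the question (variables rows1..rows4).
row  = categorical({'p' 'f' 'p' 'f'});
rows = [row; row; row];
ttt  = array2table(rows);

% Same helper as above; 'once' (an addition) stops at the first match.
fn = @(v,c) ~isempty(regexp(reshape(char(v),1,[]), [c '{3,}'], 'once'));

% Loop over the columns, building the logical vector one entry per column.
nVars = width(ttt);
isRun = false(1, nVars);
for k = 1:nVars
    isRun(k) = fn(ttt{:,k}, 'p');
end

% Equivalent one-liner with varfun, returning a plain logical row.
isRun2 = varfun(@(v) fn(v,'p'), ttt, 'OutputFormat', 'uniform');

% Logical indexing into the variable names.
names = ttt.Properties.VariableNames(isRun)   % {'rows1','rows3'} for the sample
```

The loop and the varfun call should give the same logical vector; the loop is just easier to step through when debugging.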
Image Analyst
21 Dec 2019
Try this:
s = load('ttt.mat')
ttt = s.ttt
% Get a value to match, let's say from ttt.rows1(1)
thingToMatch = ttt.rows1(1)
numMatches = zeros(1, size(ttt, 1)); % Preallocate one count per row.
for row = 1 : size(ttt, 1)
% Extract this row, just for simplicity in debugging.
thisRow = ttt{row, :}
% See how many match p
ia = ismember(thisRow, thingToMatch)
numMatches(row) = sum(ia);
end
numMatches % Echo to command window.
You'll see the number of matches:
thingToMatch =
categorical
p
numMatches =
2 2 2
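If what is wanted is the names of the columns where every row equals the value, rather than a count per row, the same comparison can be done column-wise without a loop. This is only a sketch on the sample data from the question; isMatch, allMatch and colNames are names made up for illustration:

```matlab
% Sample data from the question.
row  = categorical({'p' 'f' 'p' 'f'});
rows = [row; row; row];
ttt  = array2table(rows);

% Value to match, taken from the table as in the loop version.
thingToMatch = ttt.rows1(1);

% Compare every element at once; ttt{:,:} is a 3x4 categorical array.
isMatch = ttt{:,:} == thingToMatch;

% A column qualifies only if all of its rows match.
allMatch = all(isMatch, 1);

colNames = ttt.Properties.VariableNames(allMatch)   % {'rows1','rows3'} for the sample
```

This still pulls the table contents out with brace indexing, so it assumes all variables are categoricals that can be concatenated.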
Comments: 7
Image Analyst
23 Dec 2019
Your "correction" does not check the very bottom row to see if they're p.
dpb
23 Dec 2019
MG, IA is correct. His test is row 3 == row 1 (3-2) and row 3 == row 2 (3-1), which checks that rows 1, 2 and 3 are the same. Your test is only row 1 (3-2) == 'p' and row 2 (3-1) == 'p'.
It appears you're back to the three consecutive elements. I submit that the string solution I posted above is far simpler for the purpose. It also scales to any value m for the number of repeats, instead of being coded specifically for 3 and needing to be recoded if the number ever changes.
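To illustrate the scaling point, here is a sketch of the same regexp idea with the run length passed in as m. The helper name hasRun and the test column col are hypothetical, and single-character category names are assumed as before:

```matlab
% Hypothetical helper: true if v contains m or more consecutive c's.
% sprintf builds the pattern, e.g. 'p{3,}' for c = 'p', m = 3.
hasRun = @(v,c,m) ~isempty(regexp(reshape(char(v),1,[]), ...
                                  sprintf('%s{%d,}', c, m), 'once'));

% Made-up column: two p's, one f, then four p's.
col = categorical({'p';'p';'f';'p';'p';'p';'p'});

r3 = hasRun(col, 'p', 3)   % true: the last four rows form a run of 4
r5 = hasRun(col, 'p', 5)   % false: no run of 5 or more
```

Changing the required run length is then just a different third argument, with no other code to touch.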