Create a table that counts the combinations of two categories

John Verdonschot on 8 Mar 2019
Commented: John Verdonschot on 14 Mar 2019
I have a question. I have two vectors of size 562 x 1. Both belong to the same dataset.
For example:
column 1 - column 2
Red - circle
Blue - square
Yellow - circle
Blue - square
How can I make an overview that counts how many reds are linked to a circle, and so on, like this:
          Red   Blue   Yellow
Circle     1     0       1
Square     0     2       0
First question: what is the best form for the input?
Should I combine them into one array (562 x 2), or something else?
Second question: with the right input, how can I create a nice overview of this?
Thanks a lot in advance.
Kind regards,

Answers (4)

Bob Thompson on 8 Mar 2019
Loading everything with one command is generally preferable, so yes, combining the two vectors into a 562x2 array is helpful.
As for the command to load them, I would suggest using readtable().
Once the data is read in, you can use a second table, with your column and row headers, to store the counts of the different combinations. Counting them is not difficult: just use sum() and logical indexing.
nred = sum(strcmp(data{:,1},'Red') & strcmp(data{:,2},'square')); % braces extract the column contents from the table
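As a complement to the single count above, here is a minimal sketch of counting every combination at once with unique and accumarray. It assumes the data is held as an N-by-2 cell array of character vectors; the variable names (data, counts, overview) are illustrative and not from the original post.

```matlab
% Example data from the question, as an N-by-2 cell array of character vectors.
data = {'Red','circle'; 'Blue','square'; 'Yellow','circle'; 'Blue','square'};

% unique returns the distinct labels (sorted) and, for each row, the index of its label.
[colors, ~, colorIdx] = unique(data(:,1));
[shapes, ~, shapeIdx] = unique(data(:,2));

% accumarray adds 1 at position (shape, color) for every row of the data.
counts = accumarray([shapeIdx, colorIdx], 1, [numel(shapes), numel(colors)]);

% Wrap the counts in a table with readable row and column labels.
overview = array2table(counts, 'VariableNames', colors', 'RowNames', shapes);
disp(overview)
```

Because unique sorts its output, the columns come out as Blue, Red, Yellow and the rows as circle, square. Combinations that never occur are reported as 0.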

Peter Perkins on 12 Mar 2019
This is almost an unstack operation. It may be that all you want is a nice summary output; I'm not sure. But often this kind of thing is data pre-processing, and unstack on a table might suit your needs. The first thing I'd do is make those shapes and colors categoricals:
>> t = table(["R";"B";"Y";"B"],["c";"s";"c";"s"],'VariableNames',["Color" "Shape"])
t =
  4×2 table
    Color    Shape
    _____    _____
     "R"      "c"
     "B"      "s"
     "Y"      "c"
     "B"      "s"
>> t.Color = categorical(t.Color);
>> t.Shape = categorical(t.Shape);
To use unstack, you'll need to add a column of ones. It's a little awkward, but it makes unstack easy:
>> t.Ones = ones(height(t),1)
t =
  4×3 table
    Color    Shape    Ones
    _____    _____    ____
      R        c       1
      B        s       1
      Y        c       1
      B        s       1
>> unstack(t,'Ones','Color','GroupingVariables','Shape')
ans =
  2×4 table
    Shape     B      R      Y
    _____    ___    ___    ___
      c      NaN      1      1
      s        2    NaN    NaN
Those are NaNs, not zeros, which may not be what you need. For data preprocessing, it makes some sense.
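If zeros are preferred over NaNs, one option is to overwrite the NaN entries after the unstack call. This is only a sketch and not part of the answer above: the variable names u and counts are introduced here for illustration, and cell arrays of character vectors are used instead of strings so that the example also runs on older releases.

```matlab
% Rebuild the small example table with categorical Color and Shape.
t = table(categorical({'R';'B';'Y';'B'}), categorical({'c';'s';'c';'s'}), ...
    'VariableNames', {'Color','Shape'});
t.Ones = ones(height(t), 1);

% Same unstack call as above; missing combinations come back as NaN.
u = unstack(t, 'Ones', 'Color', 'GroupingVariables', 'Shape');

% Pull out the numeric count columns (everything after Shape), zero the NaNs, write back.
counts = u{:, 2:end};
counts(isnan(counts)) = 0;
u{:, 2:end} = counts;
disp(u)
```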

John Verdonschot on 11 Mar 2019
Edited: John Verdonschot on 12 Mar 2019
Great, thank you!
I still have an issue.
It does not count the values.
I tested it with -> red = sum(strcmp(MatlabtestSetS1(:,1),'Red'));
The value is 0.
I did not use readtable() because it gave errors. Instead I used:
load('Matlabtest.mat', 'MatlabtestSetS1'); where 'MatlabtestSetS1' is the 562x2 array.
Do you have any advice for this?
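One possible cause of a count of 0, offered only as an assumption since the thread does not show the class of MatlabtestSetS1: if the variable is a table, parentheses return a table rather than text, and strcmp returns false for non-text input. The sketch below uses a small stand-in table T (a hypothetical name) to show how to check the class and extract the text with braces.

```matlab
% Stand-in for the loaded variable; here it is assumed to be a table of text.
T = table({'Red';'Blue';'Yellow';'Blue'}, {'circle';'square';'circle';'square'});

% Check what kind of variable was loaded before comparing.
disp(class(T))            % shows the container type, here 'table'

% T(:,1) is still a table, which strcmp does not treat as text.
% T{:,1} extracts the cell array of character vectors stored in the column.
firstColumn = T{:,1};

% strtrim guards against stray leading or trailing spaces in the labels.
nRed = sum(strcmp(strtrim(firstColumn), 'Red'));
disp(nRed)
```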
An additional question, though it is not important right now:
Is it also possible (as an optimization) to create an array X (Red, Blue, Yellow) and an array Y (Circle; Square)
and use a double for loop around
nred = sum(strcmp(MatlabtestSetS1(:,1),X) & strcmp(MatlabtestSetS1(:,2),Y));
Thank you
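The double loop asked about here could look roughly like the sketch below. It assumes the data is an N-by-2 cell array of character vectors whose labels match X and Y exactly (including letter case); data and counts are illustrative names.

```matlab
% Example data from the question, as an N-by-2 cell array of character vectors.
data = {'Red','circle'; 'Blue','square'; 'Yellow','circle'; 'Blue','square'};

X = {'Red','Blue','Yellow'};   % column labels (colors)
Y = {'circle','square'};       % row labels (shapes)

% One row per shape, one column per color.
counts = zeros(numel(Y), numel(X));
for r = 1:numel(Y)
    for c = 1:numel(X)
        % Count the rows whose color is X{c} and whose shape is Y{r}.
        counts(r,c) = sum(strcmp(data(:,1), X{c}) & strcmp(data(:,2), Y{r}));
    end
end
disp(counts)
```

Note that X{c} and Y{r} use braces so that strcmp compares against a single character vector at a time.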

John Verdonschot on 13 Mar 2019
Ah, OK. I had already expected that my data (which is text) should be strings, so that it would be easy to compare with other input strings. But I have to preprocess my data. When I try your example as a test in my MATLAB, I immediately get some errors.
I copied only this line: t = table(["R";"B";"Y";"B"],["c";"s";"c";"s"],'VariableNames',["Color" "Shape"])
I looked at the errors, but they are hard to understand (or rather, I do not know what I have to do).
Is there maybe some setting I have to activate, or something I have to download?
"Error using matlab.internal.tabular.private.varNamesDim/validateAndAssignLabels (line 294)
The VariableNames property must be a cell array, with each element containing one nonempty
character vector.
Error in matlab.internal.tabular.private.tabularDimension/setLabels (line 173)
obj =
Error in matlab.internal.tabular.private.tabularDimension/createLike_impl (line 355)
obj = obj.setLabels(dimLabels,[]);
Error in matlab.internal.tabular.private.varNamesDim/createLike (line 70)
obj = obj.createLike_impl(dimLength,dimLabels);
Error in tabular/initInternals (line 207)
t.varDim = t.varDim.createLike(nvars,varnames); % error if invalid, duplicate, or
Error in table (line 254)
t = t.initInternals(vars, numRows, rownames, numVars, varnames);"

Peter Perkins on 13 Mar 2019
You are using a version of MATLAB prior to R2018a (IIRC), which does not support strings for the parameter names/values. Change ["R"; ...] etc. to {'R'; ...} etc. and that will work.
Sorry, I'm always on the newest.
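For reference, the suggested change applied to the earlier example might look like the sketch below, with cell arrays of character vectors in place of the string arrays for both the data and the 'VariableNames' value.

```matlab
% Cell arrays of character vectors instead of string arrays.
t = table({'R';'B';'Y';'B'}, {'c';'s';'c';'s'}, 'VariableNames', {'Color','Shape'});

% The conversion to categorical works the same way on cell arrays of text.
t.Color = categorical(t.Color);
t.Shape = categorical(t.Shape);
disp(t)
```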

John Verdonschot on 14 Mar 2019
I installed the new version, and it works. I learned a bit more about MATLAB.
Thanks, Bob and Peter. All the responses were useful!

