How to group data within a column by specific text within that column

Candice Cooper on 6 Feb 2021
Commented: Candice Cooper on 4 Mar 2021
I have a dataset of about 260,000 data points. One of the columns, "species_name", contains various species names. How can I group the data by specific species names, so that the other columns in the dataset (size, for example) are grouped by species as well?
  2 Comments
Adam Danz on 7 Feb 2021
Are you just trying to index the table?
load fisheriris
T = table(categorical(species), meas(:,1),meas(:,2),meas(:,3),meas(:,4));
T.Properties.VariableNames{1} = 'Species'
T = 150x5 table
    Species    Var2    Var3    Var4    Var5
    _______    ____    ____    ____    ____
    setosa     5.1     3.5     1.4     0.2
    setosa     4.9     3       1.4     0.2
    setosa     4.7     3.2     1.3     0.2
    setosa     4.6     3.1     1.5     0.2
    setosa     5       3.6     1.4     0.2
    setosa     5.4     3.9     1.7     0.4
    setosa     4.6     3.4     1.4     0.3
    setosa     5       3.4     1.5     0.2
    setosa     4.4     2.9     1.4     0.2
    setosa     4.9     3.1     1.5     0.1
    setosa     5.4     3.7     1.5     0.2
    setosa     4.8     3.4     1.6     0.2
    setosa     4.8     3       1.4     0.1
    setosa     4.3     3       1.1     0.1
    setosa     5.8     4       1.2     0.2
    setosa     5.7     4.4     1.5     0.4
T(T.Species=='virginica',:)
ans = 50x5 table
    Species      Var2    Var3    Var4    Var5
    _________    ____    ____    ____    ____
    virginica    6.3     3.3     6       2.5
    virginica    5.8     2.7     5.1     1.9
    virginica    7.1     3       5.9     2.1
    virginica    6.3     2.9     5.6     1.8
    virginica    6.5     3       5.8     2.2
    virginica    7.6     3       6.6     2.1
    virginica    4.9     2.5     4.5     1.7
    virginica    7.3     2.9     6.3     1.8
    virginica    6.7     2.5     5.8     1.8
    virginica    7.2     3.6     6.1     2.5
    virginica    6.5     3.2     5.1     2
    virginica    6.4     2.7     5.3     1.9
    virginica    6.8     3       5.5     2.1
    virginica    5.7     2.5     5       2
    virginica    5.8     2.8     5.1     2.4
    virginica    6.4     3.2     5.3     2.3
Candice Cooper on 4 Mar 2021
I would accept this answer, but it's a comment. This is essentially what I was trying to do.
ind = find(T.Species=='virginica'), to use your example.
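For reference, the logical comparison can index the table directly, and find is only needed when the row numbers themselves are wanted. A minimal sketch of both, assuming a small made-up table (the variable names Species and Size and all values here are invented for illustration, not taken from the real dataset):

```matlab
% Hypothetical stand-in table (names and values invented for illustration)
T = table(categorical({'star';'bat';'crab';'star'}), [2; 5; 3; 4], ...
    'VariableNames', {'Species','Size'});

isStar = (T.Species == 'star');   % logical mask, one element per row
ind    = find(isStar);            % row numbers, only if needed explicitly

starRows  = T(isStar, :);         % sub-table holding just the 'star' rows
starSizes = T.Size(isStar);       % the sizes that belong to those rows
```

T(ind,:) and T(isStar,:) return the same rows; the mask simply skips the intermediate call.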


Answers (2)

dpb on 6 Feb 2021
A sample dataset always helps, but it would probably be good to convert the species names to a categorical variable first (although that's not mandatory).
Then use grouping variables. If you're keeping the data in an array, see
doc findgroups
doc splitapply
or, for a table or timetable, look at
doc rowfun
  2 Comments
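As a rough sketch of the array route with findgroups and splitapply, assuming the species names and sizes sit in two plain column vectors (the names species_name and sz and the values are made up here):

```matlab
% Hypothetical data: species labels and matching sizes (invented values)
species_name = categorical({'star';'bat';'crab';'star';'bat';'star'});
sz           = [2; 5; 3; 4; 7; 6];

% G holds a group number for every row; groupNames lists the groups in order
[G, groupNames] = findgroups(species_name);

% Apply a function to the sizes of each group; here, the per-group mean
groupMeans = splitapply(@mean, sz, G);

% Collect the result in a small table for display
result = table(groupNames, groupMeans, 'VariableNames', {'Species','MeanSize'});
```

No data gets copied into separate per-species arrays; the group numbers in G do all the bookkeeping.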
Candice Cooper on 6 Feb 2021
I've tried reading through those and attempted a few things before posting this question, but I can't seem to figure it out. As an example, I have a column 'species_name', and within that column 'star', 'bat', and 'crab' are randomly dispersed. I then have another column, 'size', that corresponds to each of those rows. I'm trying to single out, let's say, 'star' as its own separate column, with the sizes that correspond to those rows in another column.
dpb on 7 Feb 2021
Well, without something to work with, it's harder to guess... attach the table or a .mat file with the data, or a short text listing of enough of it to illustrate.
Then give us a precise definition of the problem to be solved.
Also, show us what you have tried and where you ran into a problem.
As I've pointed out in several related questions recently, you rarely need to actually separate the data out into separate arrays; instead of duplicating data you already have, use grouping variables and process as wanted.



dpb on 7 Feb 2021
Illustration with faked data...
tmp=categorical({'star','bat','crab'}); % the categorical variable categories
t=table(tmp(randi(3,[20,1])).',randn(20,1),'VariableNames',{'Species','Size'}); % make up some data
>> head(t) % show what first little bit looks like...
ans =
8×2 table
Species Size
_______ ________
bat -0.65863
crab -1.2834
crab 0.23872
bat 1.5475
star 0.1869
star -1.8809
crab 0.40569
bat 0.64618
>> summary(t) % summary statistics on the table
Variables:
Species: 20×1 categorical
Values:
bat 6
crab 9
star 5
Size: 20×1 double
Values:
Min -1.8809
Median 0.21281
Max 1.5967
>> rowfun(@mean,t,'GroupingVariables','Species', ...
'InputVariables','Size','OutputVariableNames','GroupMean') % group means
ans =
3×3 table
Species GroupCount GroupMean
_______ __________ _________
bat 6 0.42427
crab 9 0.10477
star 5 -0.46693
>>
From there you can apply whatever function you want to each group...
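rowfun is not the only way to get per-group statistics from a table. A minimal sketch using groupsummary instead, assuming a release that has it (R2018a or later) and a small made-up table in place of the random data above:

```matlab
% Hypothetical table with fixed values (invented for illustration)
t = table(categorical({'star';'bat';'crab';'star';'bat';'star'}), ...
    [2; 5; 3; 4; 7; 6], 'VariableNames', {'Species','Size'});

% One row per species, with the group count plus the mean and max of Size
S = groupsummary(t, 'Species', {'mean','max'}, 'Size');

% Result variables are named <method>_<variable>, e.g. S.mean_Size
meanSizes = S.mean_Size;
maxSizes  = S.max_Size;
```

Several statistics come back in one call, so there is no need for a separate rowfun call per function.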
