Splitapply array to fit distributions

AbioEngineer on 9 Dec 2020
Edited: Cris LaPierre on 9 Dec 2020
I have an Mx1 vector and an Mx1 column of categorical variables. Is there a way to use splitapply so that I get a distribution for the data in each group? I want to do this as compactly as possible because in reality I have an MxN array, and I will iterate column-wise so that I get a probability distribution for each category's values in each column. Like the code below:
%%Generate some data
X1 = 10 + 5 * randn(200, 1);
X2 = 20 + 8 * randn(250 ,1);
cat1=repmat("a",200,1);
cat2=repmat("b",250,1);
X = [X1; X2];
cats=[cat1;cat2];
%%Fit a distribution using a kernel smoother
myFit1 = fitdist(X1, 'kernel')
myFit2 = fitdist(X2, 'kernel')
I would like to make sure each of the fits is the same as what I get when I do something like:
newfit=splitapply(@fitdist,X,G)
but I get the error that fitdist doesn't have enough input arguments. I'm new to anonymous functions, but I suspect I need to somehow pass 'kernel' to fitdist in splitapply. Can anyone help?

Accepted Answer

Cris LaPierre on 9 Dec 2020
Edited: Cris LaPierre on 9 Dec 2020
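You are close. You need to use findgroups to create your grouping variable G. After that, it is a matter of setting up the function handle correctly: splitapply passes only the data for each group to the function, so wrap fitdist in an anonymous function that supplies 'kernel' as the second argument. Here's your code with that slight modification from me. If you compare the results, you'll see they are the same.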
%%Generate some data
X1 = 10 + 5 * randn(200, 1);
X2 = 20 + 8 * randn(250 ,1);
cat1=repmat("a",200,1);
cat2=repmat("b",250,1);
X = [X1; X2];
cats=[cat1;cat2];
%%Fit a distribution using a kernel smoother
myFit1 = fitdist(X1, 'kernel')
myFit1 =
  KernelDistribution
    Kernel = normal
    Bandwidth = 1.81224
    Support = unbounded
myFit2 = fitdist(X2, 'kernel')
myFit2 =
  KernelDistribution
    Kernel = normal
    Bandwidth = 2.91665
    Support = unbounded
% Now use findgroups/splitapply
G=findgroups(cats);
newfit=splitapply(@(X)fitdist(X,'kernel'),X,G);
newfit(1)
ans =
  KernelDistribution
    Kernel = normal
    Bandwidth = 1.81224
    Support = unbounded
newfit(2)
ans =
  KernelDistribution
    Kernel = normal
    Bandwidth = 2.91665
    Support = unbounded
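
For the MxN case mentioned in the question, one possible approach is to loop over the columns and call splitapply once per column. The following is only a rough sketch under some assumptions: the three-column array, the fixed random seed, and the names nCols, fits, direct, xq, and maxDiff are made up for illustration, and the fits are collected in a cell array by having the anonymous function return a 1-by-1 cell. It also compares one entry against a direct fitdist call on the same rows by evaluating both densities with pdf.

```matlab
% Sketch: per-group kernel fits for every column of an M-by-N array.
% Requires Statistics and Machine Learning Toolbox (fitdist, pdf).
rng(0);                                   % fixed seed, for reproducibility only
nCols = 3;                                % hypothetical number of columns
X = [10 + 5*randn(200, nCols);            % rows belonging to group "a"
     20 + 8*randn(250, nCols)];           % rows belonging to group "b"
cats = [repmat("a", 200, 1); repmat("b", 250, 1)];

[G, groupNames] = findgroups(cats);       % G holds one group index per row
nGroups = numel(groupNames);

% fits{g, k} holds the kernel fit for group g in column k.
fits = cell(nGroups, nCols);
for k = 1:nCols
    % Returning a 1-by-1 cell lets splitapply stack the fit objects
    % into an nGroups-by-1 cell array.
    fits(:, k) = splitapply(@(x) {fitdist(x, 'kernel')}, X(:, k), G);
end

% Compare one entry with a direct fit on the same rows.
direct = fitdist(X(cats == "a", 2), 'kernel');
xq = linspace(-10, 40, 50)';              % query points for the densities
maxDiff = max(abs(pdf(fits{1, 2}, xq) - pdf(direct, xq)));
```

Because both fits use the same rows, maxDiff should be zero or negligibly small. Each fit can then be used like any other distribution object, for example pdf(fits{2, 1}, xq) for group "b" in the first column.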

Release

R2019b
