Violinplot extending beyond data range

Angie on 28 Nov 2024
Commented: William Rose on 3 Dec 2024
Hello everyone,
I’m using the violinplot function in MATLAB to create violin plots for some datasets. I am specifying the position and the data as follows:
violinplot(3, data2(5:end));
However, I’ve encountered an issue. The violin plot extends to negative values even though all my data values are positive. For another dataset, I observed a similar problem: the violin plot includes values that are negative or larger than the maximum values in my data.
I’ve read that this might be caused by the kernel density estimation (KDE) method used by violinplot to calculate and visualize the data's probability density. KDE smooths the data distribution and can sometimes produce density values outside the actual range of the data.
I’m unsure how to resolve this issue and would greatly appreciate any advice or suggestions.
Thank you!
Angie

Accepted Answer

William Rose on 28 Nov 2024
Edited: William Rose on 28 Nov 2024
[Edit: add ylim() so that all 3 plots have same y-axis range.]
You can vary the bandwidth, the kernel function, or both. In the examples below, the data are uniformly distributed on (0,1), which is close to a worst case if you don't want the violin to extend to negative values. The violins still extend beyond the data in these examples, but the options control by how much. Experiment to see whether you like the results. Depending on your data, you may not be able to keep the violin from going negative.
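ydata = rand(100,1);   % 100 samples, uniform on (0,1)
figure;
% Panel 1: default violin plot (default kernel and bandwidth)
subplot(131)
violinplot(ydata);
title('Default Violinplot'); ylim([-.5,1.5])
% Panel 2: narrower bandwidth, so the density spills less past the data
[f1,xf1] = kde(ydata,Bandwidth=0.05);
subplot(132)
violinplot(EvaluationPoints=xf1,DensityValues=f1)
title('Bandwidth=0.05'); ylim([-.5,1.5])
% Panel 3: box kernel instead of the default kernel
[f2,xf2] = kde(ydata,Kernel="box");
subplot(133)
violinplot(EvaluationPoints=xf2,DensityValues=f2)
title('Box Kernel'); ylim([-.5,1.5])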
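If the violin must not extend past the data at all, one further option is to clip the density estimate to the observed range before plotting. The following is only a sketch under stated assumptions, not part of the tested examples above: the variable names inRange, xfIn, and fIn are made up for illustration, and the renormalization with trapz is my own choice so that the clipped curve still has unit area.

```matlab
% Sketch: clip a kernel density estimate to the observed data range
% so the violin cannot extend past the smallest or largest data value.
ydata = rand(100,1);                  % example data, uniform on (0,1)
[f,xf] = kde(ydata);                  % default KDE; its grid extends past the data
inRange = xf >= min(ydata) & xf <= max(ydata);   % keep grid points inside the data range
xfIn = xf(inRange);
fIn  = f(inRange);
fIn  = fIn / trapz(xfIn,fIn);         % rescale so the clipped curve has area 1
figure;
violinplot(EvaluationPoints=xfIn,DensityValues=fIn)
title('KDE clipped to data range'); ylim([-.5,1.5])
```

The clipped violin ends at the grid points nearest to the data minimum and maximum, so it can stop slightly short of the extreme values. Clipping also hides the smoothing rather than removing it, so the shape near the ends is still influenced by the kernel.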
Comments: 4
Angie on 3 Dec 2024
Thank you very much! Because a pdf estimated with a kernel distribution extends beyond the most extreme points in my dataset, which is something I want to avoid, I was considering using other distributions instead. Your examples have been very helpful.
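For data that are known to be strictly positive, one alternative that stays with kernel estimation is to restrict the support of the estimate. The sketch below assumes that kde accepts a Support name-value argument with the value "positive" in your release; please check the kde documentation before relying on it. The names fp and xfp are hypothetical. Note that this only addresses the negative side and does not stop the violin from extending above the largest data value.

```matlab
% Sketch, assuming kde supports Support="positive" in your release.
ydata = rand(100,1);                        % strictly positive example data
[fp,xfp] = kde(ydata,Support="positive");   % density restricted to positive values
figure;
violinplot(EvaluationPoints=xfp,DensityValues=fp)
title('Support="positive"'); ylim([-.5,1.5])
```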
William Rose on 3 Dec 2024
@Angie, you're welcome.


Release: R2024b
